
Information Technology, 2022, Issue 8

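Research on a Feature Clustering and Selection Method for Target Recognition
GUI Hongguan¹, WEI Kai²
(1. Data Grand Information Technology (Shanghai) Co., Ltd., Shanghai 201203, China; 2. Shanghai Maritime University, Shanghai 200135, China)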

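Abstract: In recent years, target recognition has been widely applied in many fields, and how to effectively improve recognition accuracy is a current research focus. This paper proposes a method that associates unknown targets with the existing targets in a knowledge graph and selects among them, with the aim of recognizing targets accurately. The method first applies a connectivity-based joint non-negative matrix factorization algorithm to cluster the features of the two datasets, placing the salient features into co-expression modules. It then applies a multi-task sparse canonical correlation analysis method to associate the elements within the co-expression modules and to mine strongly correlated associations. The top-weighted features can then be used in subsequent classification studies.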
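The feature clustering step described above can be sketched in a few lines. This is a minimal illustration rather than the authors' algorithm: it assumes the two datasets share the same samples as rows, uses plain multiplicative updates for a shared basis matrix, and omits the connectivity constraint entirely. The names `joint_nmf` and `select_modules`, the z-score threshold of 2.0, and the random input data are assumptions introduced here for illustration.

```python
import numpy as np

def joint_nmf(X1, X2, k, n_iter=200, eps=1e-9, seed=0):
    """Factorize X1 ~ W @ H1 and X2 ~ W @ H2 with one shared basis W.

    Assumption: X1 (n x p1) and X2 (n x p2) are non-negative and share rows.
    Plain multiplicative updates; no connectivity constraint is applied.
    """
    rng = np.random.default_rng(seed)
    n = X1.shape[0]
    W = rng.random((n, k)) + eps
    H1 = rng.random((k, X1.shape[1])) + eps
    H2 = rng.random((k, X2.shape[1])) + eps
    losses = []
    for _ in range(n_iter):
        # Update the shared basis using both datasets at once.
        W *= (X1 @ H1.T + X2 @ H2.T) / (W @ (H1 @ H1.T + H2 @ H2.T) + eps)
        # Update each coefficient matrix against the shared basis.
        H1 *= (W.T @ X1) / (W.T @ W @ H1 + eps)
        H2 *= (W.T @ X2) / (W.T @ W @ H2 + eps)
        losses.append(np.linalg.norm(X1 - W @ H1) ** 2
                      + np.linalg.norm(X2 - W @ H2) ** 2)
    return W, H1, H2, losses

def select_modules(H, z_thresh=2.0):
    """For each row of H, return the feature indices whose z-score exceeds z_thresh."""
    z = (H - H.mean(axis=1, keepdims=True)) / (H.std(axis=1, keepdims=True) + 1e-12)
    return [np.flatnonzero(row > z_thresh) for row in z]

# Hypothetical data: 20 shared samples, 30 and 40 features respectively.
rng = np.random.default_rng(1)
X1 = rng.random((20, 30))
X2 = rng.random((20, 40))
W, H1, H2, losses = joint_nmf(X1, X2, k=3)
modules1 = select_modules(H1)  # one index array per component, dataset 1
modules2 = select_modules(H2)  # one index array per component, dataset 2
```

Under these assumptions, the features selected from row `i` of `H1` and row `i` of `H2` together play the role of one co-expression module, because both rows are tied to the same column of `W`.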


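Keywords: feature selection; matrix factorization; canonical correlation analysis; linear regression model; multi-task framework; knowledge graph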



DOI:10.19850/j.cnki.2096-4706.2022.17.007


CLC number: TP391.1    Document code: A    Article ID: 2096-4706(2022)17-0029-05


Research on Feature Clustering and Selection Method for Target Recognition

GUI Hongguan¹, WEI Kai²

(1. Data Grand Information Technology (Shanghai) Co., Ltd., Shanghai 201203, China; 2. Shanghai Maritime University, Shanghai 200135, China)

Abstract: In recent years, target recognition has been widely used in many fields, and how to effectively improve its accuracy is an active research topic. This paper proposes a method for associating unknown targets with existing targets in a knowledge graph and selecting among them, aiming at accurate target recognition. The method first uses a connectivity-based joint non-negative matrix factorization algorithm to cluster the features of the two datasets and places the salient features into co-expression modules. A multi-task sparse canonical correlation analysis method is then used to associate the elements in the co-expression modules and to mine strongly correlated associations. The top features with the highest weights can be used in subsequent classification research.
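The association step can be illustrated with a single-task sparse canonical correlation analysis. The sketch below is not the multi-task method named in the abstract: it swaps in a simpler scheme that alternates soft-thresholding and normalization on the cross-correlation matrix, which is enough to show how sparse weights yield a ranked list of top features. The function names, the penalty values, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(a, lam):
    """Shrink each entry of `a` toward zero by `lam` (L1 proximal operator)."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def unit(a):
    """Scale `a` to unit L2 norm; an all-zero vector is returned unchanged."""
    norm = np.linalg.norm(a)
    return a / norm if norm > 0 else a

def sparse_cca(X, Y, lam_u=0.2, lam_v=0.2, n_iter=100):
    """Single-task sparse CCA by alternating soft-thresholded updates.

    Assumption: X (n x p) and Y (n x q) share rows; columns are standardized here.
    Returns sparse weight vectors u (length p) and v (length q).
    """
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    Ys = (Y - Y.mean(axis=0)) / (Y.std(axis=0) + 1e-12)
    C = Xs.T @ Ys / X.shape[0]  # p x q cross-correlation matrix
    # Start from the leading right singular vector of C.
    v = np.linalg.svd(C, full_matrices=False)[2][0]
    u = np.zeros(C.shape[0])
    for _ in range(n_iter):
        u = unit(soft_threshold(C @ v, lam_u))
        v = unit(soft_threshold(C.T @ u, lam_v))
    return u, v

# Hypothetical data: one latent signal drives X[:, :3] and Y[:, :2].
rng = np.random.default_rng(0)
z = rng.standard_normal(200)
X = rng.standard_normal((200, 10))
X[:, :3] += 2.0 * z[:, None]
Y = rng.standard_normal((200, 8))
Y[:, :2] += 2.0 * z[:, None]

u, v = sparse_cca(X, Y)
top_x = np.argsort(-np.abs(u))[:3]  # indices of the three largest |u| weights
top_y = np.argsort(-np.abs(v))[:2]  # indices of the two largest |v| weights
```

Ranking features by the absolute value of their weights, as in `top_x` and `top_y`, is one simple way to obtain the top features that the abstract passes on to classification.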

Keywords: feature selection; matrix factorization; canonical correlation analysis; linear regression model; multi-task framework; knowledge graph



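About the authors: GUI Hongguan (b. 1977), male, Han ethnicity, from Huai'an, Jiangsu; engineer, master's degree; research interests: natural language processing and search engines. WEI Kai (b. 1997), male, Han ethnicity, from Harbin, Heilongjiang; assistant engineer, master's degree; research interest: bioinformatics.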