[1] HU Jun, WANG Haifeng. Feature selection algorithm of multi-labeled data based on weighted information granulation[J]. CAAI Transactions on Intelligent Systems, 2023, 18(3): 619-628. [doi:10.11992/tis.202111058]
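CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 18
Issue: No. 3, 2023
Pages: 619-628
Column: Academic Papers: Foundations of Artificial Intelligence
Publication date: 2023-07-05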
- Title:
-
Feature selection algorithm of multi-labeled data based on weighted information granulation
- Author(s):
-
HU Jun (胡军)1,2, WANG Haifeng (王海峰)1,2
-
1. College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China;
2. Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Keywords:
-
neighborhood rough set; information granulation; multi-label learning; label significance; label relationship; feature weight; feature selection; spectral clustering
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.202111058
- Abstract:
-
Feature selection removes irrelevant and redundant features and is an effective tool for mitigating the curse of dimensionality in multi-label data. Existing multi-label feature selection algorithms do not take correlations in the label space into account, assume that all relevant labels of a sample are equally important, and overlook the possibility that the feature space is an intrinsic factor behind differences in label significance. As a result, the selected features cannot describe the samples accurately and comprehensively, and the computation is complex. To address this, the paper uses the correlation between labels to partition the label space and thereby simplify the computation, defines a label significance measure and feature weights, and on this basis proposes a multi-label feature selection algorithm based on weighted information granulation. Comparative experiments on real multi-label data sets show that the proposed algorithm outperforms the compared algorithms on all evaluation metrics, which verifies its effectiveness and feasibility.
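As a rough illustration of the ideas named in the abstract, the following is a minimal sketch and not the authors' algorithm. The abstract does not give the paper's definitions, so the sketch falls back on the standard neighborhood rough set dependency degree, computed once per label, and combines the per-label values with hypothetical label weights. The function names, the Euclidean distance, the radius `delta`, the toy data, and the weighted average are all assumptions made for illustration.

```python
from math import dist

def neighborhood(X, i, features, delta):
    """Indices of samples within distance delta of sample i on the chosen features."""
    xi = [X[i][f] for f in features]
    return [j for j, row in enumerate(X)
            if dist(xi, [row[f] for f in features]) <= delta]

def dependency(X, y, features, delta):
    """Fraction of samples whose whole neighborhood agrees with them on label y."""
    consistent = 0
    for i in range(len(X)):
        if all(y[j] == y[i] for j in neighborhood(X, i, features, delta)):
            consistent += 1
    return consistent / len(X)

def weighted_dependency(X, Y, label_weights, features, delta):
    """Label-weighted average of per-label dependencies (assumed weighting scheme)."""
    total = sum(label_weights)
    return sum(w * dependency(X, [row[k] for row in Y], features, delta)
               for k, w in enumerate(label_weights)) / total

# Toy data (hypothetical): label 0 follows feature 0, label 1 follows feature 1.
X = [[0.0, 0.0], [0.1, 0.9], [0.9, 0.1], [1.0, 1.0]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]
weights = [0.75, 0.25]  # assumed label significance values

score_f0 = weighted_dependency(X, Y, weights, [0], 0.2)
score_f1 = weighted_dependency(X, Y, weights, [1], 0.2)
score_both = weighted_dependency(X, Y, weights, [0, 1], 0.2)
print(score_f0, score_f1, score_both)
```

In this toy setting, feature 0 scores higher than feature 1 only because label 0 was given the larger weight; a greedy selector built on such a score would therefore pick feature 0 first. How the paper actually derives label significance and feature weights is not stated in the abstract.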
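Memo
Received: 2021-11-30.
Funding: National Natural Science Foundation of China (61936001, 62276038); Natural Science Foundation of Chongqing (cstc2019jcyj-cxttX0002, cstc2021ycjh-bgzxm0013); Key Cooperation Project of Chongqing Municipal Education Commission (HZ2021008).
About the authors: HU Jun, professor, Ph.D. Main research interests: multi-granularity cognitive computing, artificial intelligence security, and graph analysis and mining. In recent years has led or participated in more than 10 research projects, including projects of the National Key R&D Program, the National Natural Science Foundation of China, and the Natural Science Foundation of Chongqing; holds 5 granted national invention patents; has published more than 60 research papers and 3 monographs. WANG Haifeng, master's student. Main research interests: granular computing and rough sets.
Corresponding author: HU Jun. E-mail: hujun@cqupt.edu.cn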
Last Update:
1900-01-01