[1] GUO Yumeng (郭雨萌), LI Guozheng (李国正). A filtering framework for the multi-label feature selection[J]. CAAI Transactions on Intelligent Systems, 2014, 9(3): 292-297. [doi:10.3969/j.issn.1673-4785.201403064]
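CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
9
Issue:
No. 3, 2014
Pages:
292-297
Section:
Academic Papers: Fundamentals of Artificial Intelligence
Publication date:
2014-06-25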
- Title:
-
A filtering framework for the multi-label feature selection
- Author(s):
-
GUO Yumeng (郭雨萌), LI Guozheng (李国正)
-
Department of Control, School of Electronic and Information Engineering, Tongji University, Shanghai 201804, China
-
- Keywords:
-
feature selection; multi-label; filter; chi-square test
- CLC number:
-
TP391
- DOI:
-
10.3969/j.issn.1673-4785.201403064
- Abstract:
-
Research on multi-label learning has mainly focused on classifier performance and has largely overlooked the influence of the dataset's features. This paper proposes a filter-based feature selection framework for multi-label data, and implements and evaluates it on the basis of the chi-square test. The framework computes the chi-square statistic of each feature with respect to each label, and then derives the final ranking of each feature from a summary statistic of those per-label scores. Three summary statistics (maximum, average, and minimum) are compared experimentally. Comparative experiments with five evaluation criteria, four commonly used multi-label datasets, and three learners show that each of the three score statistics has its own strengths and weaknesses, but all of them improve multi-label learning performance.
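The scoring scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes discrete feature values and binary labels, computes the plain Pearson chi-square statistic from a contingency table, and ranks features in descending order of the aggregated score. The names `chi_square`, `rank_features`, and `AGGREGATORS`, and the toy data, are hypothetical and do not come from the paper.

```python
from collections import Counter
from statistics import mean


def chi_square(feature, label):
    """Pearson chi-square statistic between two discrete sequences."""
    n = len(feature)
    joint = Counter(zip(feature, label))
    f_counts = Counter(feature)
    l_counts = Counter(label)
    stat = 0.0
    for f, nf in f_counts.items():
        for l, nl in l_counts.items():
            expected = nf * nl / n           # count expected under independence
            observed = joint.get((f, l), 0)  # count actually seen
            stat += (observed - expected) ** 2 / expected
    return stat


# The three summary statistics named in the abstract.
AGGREGATORS = {"max": max, "avg": mean, "min": min}


def rank_features(X, Y, how="max"):
    """Score every feature against every label, aggregate, and rank.

    X: list of samples, each a list of discrete feature values.
    Y: list of samples, each a list of 0/1 label indicators.
    Returns (order, scores); order lists feature indices, best first
    (descending order is an assumption of this sketch).
    """
    agg = AGGREGATORS[how]
    n_features, n_labels = len(X[0]), len(Y[0])
    label_cols = [[row[k] for row in Y] for k in range(n_labels)]
    scores = []
    for j in range(n_features):
        col = [row[j] for row in X]
        per_label = [chi_square(col, lab) for lab in label_cols]
        scores.append(agg(per_label))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data (hypothetical): feature 0 copies label 0, feature 1 copies label 1,
# feature 2 is unrelated to both, feature 3 is "label 0 AND label 1".
Y = [[1, 1], [1, 1], [1, 0], [1, 0], [0, 1], [0, 1], [0, 0], [0, 0]]
X = [[y0, y1, i % 2, y0 & y1] for i, (y0, y1) in enumerate(Y)]

for how in ("max", "avg", "min"):
    order, scores = rank_features(X, Y, how)
    print(how, order, [round(s, 3) for s in scores])
```

On this toy data the maximum favors features that are strongly tied to a single label, whereas the minimum favors the feature that is at least weakly related to every label, which is one way the three statistics can produce different rankings.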
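Memo
Received: 2014-03-25.
Funding: National Natural Science Foundation of China (61273305)
About the author: GUO Yumeng, male, born in 1989, Ph.D. candidate. His main research interests include pattern recognition and machine learning.
Corresponding author: LI Guozheng, male, born in 1977, research professor, doctoral supervisor, Ph.D., standing committee member of the Machine Learning Technical Committee of the Chinese Association for Artificial Intelligence. His main research interests are pattern recognition and biomedical data mining. He has led or completed multiple projects, including National Natural Science Foundation of China projects and a sub-project of a major project under the Shanghai Science and Technology Commission's "Innovation Action Plan". He has published more than 100 academic papers, of which more than 40 are SCI-indexed and more than 50 are EI-indexed, has contributed to 6 monographs, and has led the translation of 1 monograph. E-mail: gzli@tongji.edu.cn.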