[1] HU Minjie, LIN Yaojin, YANG Honghe, et al. Spectral feature selection based on feature correlation[J]. CAAI Transactions on Intelligent Systems (智能系统学报), 2017, 12(4): 519-525. [doi:10.11992/tis.201609008]
CAAI Transactions on Intelligent Systems (《智能系统学报》) [ISSN 1673-4785/CN 23-1538/TP]
Volume: 12
Issue: No. 4, 2017
Pages: 519-525
Section: Academic Papers, Machine Learning
Publication date: 2017-08-25
- Title:
-
Spectral feature selection based on feature correlation
- Author(s):
-
HU Minjie (胡敏杰), LIN Yaojin (林耀进), YANG Honghe (杨红和), ZHENG Liping (郑荔平), FU Wei (傅为)
-
School of Computer Science, Minnan Normal University, Zhangzhou, Fujian 363000, China
- Keywords:
-
feature selection; spectral feature selection; spectral graph theory; feature relevance; discernibility; search strategy; Laplacian score; classification performance
- CLC number:
-
TP18
- DOI:
-
10.11992/tis.201609008
- Abstract:
-
Traditional spectral feature selection algorithms consider only the importance of individual features. In this paper, we introduce the statistical correlation between features into traditional spectral analysis and construct a spectral feature selection model based on feature correlation. The model first uses the Laplacian Score to identify the single most central feature and takes it as the initial selected feature. It then defines a new objective function that measures the discernibility of a feature group and applies a forward greedy search strategy to evaluate the candidate features one by one, adding the candidate that minimizes the objective function to the selected features. The algorithm thus considers both the importance of individual features and the correlations between features. Experiments with two different classifiers on eight UCI datasets show that the algorithm improves the classification performance of the selected feature subset and reaches high classification accuracy with fewer features.
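The two-stage procedure described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: `laplacian_scores` follows the commonly used definition of the Laplacian Score (k-nearest-neighbour graph, heat-kernel weights, lower score is better), and `greedy_select` is a generic forward greedy loop. The abstract does not give the paper's feature-group objective function, so `placeholder_objective` is a hypothetical stand-in (candidate Laplacian Score plus mean absolute Pearson correlation with the already selected features). The function names and the parameters `k` and `t` are likewise assumptions.

```python
import numpy as np


def laplacian_scores(X, k=5, t=1.0):
    """Laplacian Score of each column of X (lower is better)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    # k nearest neighbours of each sample, excluding the sample itself.
    nn = d2.copy()
    np.fill_diagonal(nn, np.inf)
    idx = np.argsort(nn, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    # Heat-kernel weights on the kNN graph, symmetrised.
    S = np.zeros((n, n))
    S[rows, cols] = np.exp(-d2[rows, cols] / t)
    S = np.maximum(S, S.T)
    d = S.sum(axis=1)          # degree vector (diagonal of D)
    L = np.diag(d) - S         # graph Laplacian
    scores = np.empty(X.shape[1])
    for r in range(X.shape[1]):
        f = X[:, r]
        f_c = f - (f @ d) / d.sum()      # remove the degree-weighted mean
        denom = f_c @ (d * f_c)          # f_c^T D f_c
        scores[r] = (f_c @ L @ f_c) / denom if denom > 1e-12 else np.inf
    return scores


def placeholder_objective(X, selected, j, scores):
    """Hypothetical stand-in, NOT the objective function from the paper."""
    corr = [abs(np.corrcoef(X[:, j], X[:, s])[0, 1]) for s in selected]
    return scores[j] + float(np.mean(corr))


def greedy_select(X, n_select, objective, k=5, t=1.0):
    """Seed with the best Laplacian Score, then add features greedily."""
    scores = laplacian_scores(X, k, t)
    selected = [int(np.argmin(scores))]
    remaining = set(range(X.shape[1])) - set(selected)
    while len(selected) < n_select and remaining:
        # Pick the candidate that minimises the objective given the current set.
        best = min(sorted(remaining),
                   key=lambda j: objective(X, selected, j, scores))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Passing the objective as a callable keeps the search loop separate from the scoring rule, so the stand-in can be swapped for the objective defined in the full paper without touching `greedy_select`.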
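Memo
Received: 2016-09-08.
Funding: National Natural Science Foundation of China (61303131, 61379021); Science and Technology Project of the Fujian Provincial Department of Education (JA14192).
About the authors: HU Minjie, female, born in 1979, lecturer; her main research interest is data mining. LIN Yaojin, male, born in 1980; his main research interests are data mining and granular computing. He has led two National Natural Science Foundation of China projects and published more than 60 academic papers. YANG Honghe, male, born in 1969, senior laboratory technician; his main research interest is the digital campus.
Corresponding author: HU Minjie, E-mail: zzhuminjie@sina.com
Last Update: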
2017-08-25