[1] LI Shunyong, WANG Gaibian. New MRMR feature selection algorithm[J]. CAAI Transactions on Intelligent Systems, 2021, 16(4): 649-661. [doi:10.11992/tis.202009016]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
16
Issue:
No. 4, 2021
Pages:
649-661
Section:
Academic Papers: Machine Learning
Publication date:
2021-07-05
- Title:
-
New MRMR feature selection algorithm
- Author(s):
-
LI Shunyong (李顺勇), WANG Gaibian (王改变)
-
School of Mathematical Sciences, Shanxi University, Taiyuan 030006, China
-
- Keywords:
-
feature selection; redundancy; relevance; dimension reduction; classification; classification accuracy; support vector machines; T-test
- CLC number:
-
TP181
- DOI:
-
10.11992/tis.202009016
- Abstract:
-
Traditional classification algorithms based on feature selection have a limited scope of application because they rely on a single evaluation criterion for redundancy and a single one for relevance. To address this problem, this paper proposes a new maximum relevance, minimum redundancy (MRMR) feature selection algorithm. It introduces two different evaluation criteria to measure the redundancy between features and four different evaluation criteria to measure the relevance between features and classes, yielding eight different feature selection algorithms and thereby broadening the scope of application. In addition, because traditional MRMR feature selection algorithms cannot select features according to the data dimension that the user actually requires, an indicator vector $\lambda$ is introduced to describe the user's required data dimension, and a new objective function is proposed to solve for the optimal feature subset. Experiments are conducted with a support vector machine on the feature subsets of four UCI datasets. Finally, classification accuracy and paired one-sided t-tests are used to verify the effectiveness of the algorithm.
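The abstract does not state the paper's redundancy and relevance criteria or its objective function, so the following is only a generic sketch of the greedy MRMR idea, not the authors' algorithm. It assumes absolute Pearson correlation for both relevance (feature vs. class label) and redundancy (feature vs. already selected features), and scores candidates by relevance minus mean redundancy. The names `pearson_abs` and `greedy_mrmr`, the parameter `k` (the requested number of features), and the 0/1 output vector loosely modeled on $\lambda$ are all illustrative assumptions.

```python
import math
from typing import List, Sequence


def pearson_abs(x: Sequence[float], y: Sequence[float]) -> float:
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def greedy_mrmr(columns: List[Sequence[float]],
                labels: Sequence[float],
                k: int) -> List[int]:
    """Greedily pick k features; return a 0/1 indicator vector over features.

    Assumed score (not taken from the paper):
    relevance(j) - mean redundancy(j, already selected features).
    """
    if not 0 < k <= len(columns):
        raise ValueError("k must be between 1 and the number of features")
    relevance = [pearson_abs(col, labels) for col in columns]
    selected: List[int] = []
    remaining = list(range(len(columns)))

    def score(j: int) -> float:
        if not selected:
            return relevance[j]
        redundancy = sum(pearson_abs(columns[j], columns[s])
                         for s in selected) / len(selected)
        return relevance[j] - redundancy

    while len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    indicator = [0] * len(columns)
    for j in selected:
        indicator[j] = 1
    return indicator


# Toy data: feature 1 is a near copy of feature 0; feature 2 is uncorrelated with feature 0.
labels = [0, 0, 0, 1, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0, 1, 1],
]
print(greedy_mrmr(columns, labels, 2))  # prints [1, 0, 1]
```

In this toy case feature 1 is more relevant than feature 2 on its own, yet the redundancy penalty makes the selection skip it in favor of feature 2, which is the behavior the MRMR criterion is meant to produce.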