[1] ZHAI Junhai, LIU Bo, ZHANG Sufang. A feature selection approach based on rough set relative classification information entropy and particle swarm optimization[J]. CAAI Transactions on Intelligent Systems, 2017, 12(3): 397-404. [doi:10.11992/tis.201705004]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume:
12
Issue:
2017, No. 3
Pages:
397-404
Column:
Academic Papers: Knowledge Engineering
Publication date:
2017-06-25
- Title:
-
A feature selection approach based on rough set relative classification information entropy and particle swarm optimization
- Author(s):
-
ZHAI Junhai (1, 2); LIU Bo (3); ZHANG Sufang (4)
-
1. Key Lab of Machine Learning and Computational Intelligence, Hebei University, Baoding 071002, China;
2. College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua 321004, China;
3. College of Computer Science and Technology, Hebei University, Baoding 071002, China;
4. Hebei Branch of Meteorological Cadres Training Institute, China Meteorological Administration, Baoding 071000, China
-
- Keywords:
-
data mining; feature selection; data preprocessing; rough set; decision table; particle swarm optimization; information entropy; fitness function
- CLC:
-
TP181
- DOI:
-
10.11992/tis.201705004
- Abstract:
-
Feature selection, an important step in data mining, selects a subset of the original feature set according to some criterion. Its purpose is to reduce the computational complexity of the learning algorithm and to improve data mining performance by removing irrelevant and redundant features. This paper proposes a feature selection approach for data with discrete-valued features. The approach uses a particle swarm optimization algorithm to search for the optimal feature subset and employs relative classification information entropy as the fitness function that measures the significance of a feature subset. The proposed approach is compared with other evolutionary algorithm-based feature selection methods, and the experimental results show that it outperforms genetic algorithm-based methods.
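
The search described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. The abstract does not define relative classification information entropy, so the sketch assumes the conditional entropy H(D | B) of the decision attribute D given the equivalence classes induced by a feature subset B as a stand-in fitness. The sigmoid-based binary particle update, the constants 0.7 and 1.5, the velocity clamp, the subset-size penalty, and every function name are likewise assumptions chosen for illustration.

```python
import math
import random
from collections import Counter, defaultdict


def conditional_entropy(rows, labels, subset):
    """H(D | B): label entropy inside the equivalence classes induced by
    the features in `subset` (assumed stand-in for the paper's measure)."""
    blocks = defaultdict(list)
    for row, label in zip(rows, labels):
        blocks[tuple(row[i] for i in subset)].append(label)
    n = len(rows)
    h = 0.0
    for block in blocks.values():
        for count in Counter(block).values():
            p = count / len(block)
            h -= (len(block) / n) * p * math.log2(p)
    return h


def fitness(rows, labels, mask, penalty=0.01):
    """Lower is better: conditional entropy plus an assumed size penalty."""
    subset = [i for i, bit in enumerate(mask) if bit]
    return conditional_entropy(rows, labels, subset) + penalty * len(subset)


def binary_pso(rows, labels, n_particles=10, n_iter=30, seed=0):
    """Search over 0/1 feature masks with a sigmoid-based binary PSO."""
    rng = random.Random(seed)
    dim = len(rows[0])
    pos = [[rng.randint(0, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(rows, labels, p) for p in pos]
    g = min(range(n_particles), key=lambda k: pbest_fit[k])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for k in range(n_particles):
            for j in range(dim):
                # Inertia plus attraction to personal and global bests.
                vel[k][j] = (0.7 * vel[k][j]
                             + 1.5 * rng.random() * (pbest[k][j] - pos[k][j])
                             + 1.5 * rng.random() * (gbest[j] - pos[k][j]))
                vel[k][j] = max(-4.0, min(4.0, vel[k][j]))
                # The sigmoid of the velocity is the probability of bit = 1.
                prob = 1.0 / (1.0 + math.exp(-vel[k][j]))
                pos[k][j] = 1 if rng.random() < prob else 0
            f = fitness(rows, labels, pos[k])
            if f < pbest_fit[k]:
                pbest[k], pbest_fit[k] = pos[k][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[k][:], f
    return gbest, gbest_fit
```

Calling `binary_pso(rows, labels)` on a decision table whose rows are tuples of discrete feature values returns a 0/1 mask over the features together with its fitness. The size penalty is there only so that, among subsets with equal entropy, the smaller one is preferred; the paper may handle subset size differently.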