[1] JI Ruohan, DONG Hongbin. Feature selection using forest optimization algorithm based on duplication analysis[J]. CAAI Transactions on Intelligent Systems, 2022, 17(6): 1113-1122. [doi:10.11992/tis.202111060]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume:
17
Issue:
2022, No. 6
Page number:
1113-1122
Column:
Academic Papers: Machine Learning
Publication date:
2022-11-05
- Title:
Feature selection using forest optimization algorithm based on duplication analysis
- Author(s):
JI Ruohan; DONG Hongbin
School of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
- Keywords:
feature selection; evolutionary algorithm; duplication analysis; information entropy; information gain; restart mechanism; forest optimization algorithm; dimensionality reduction
- CLC:
TP301
- DOI:
10.11992/tis.202111060
- Abstract:
The forest optimization algorithm is an evolutionary algorithm based on the concept of forest tree planting. It searches the feature space effectively and is easy to implement. However, the convergence speed and overall optimization ability of the forest leave room for improvement, and the algorithm is not well suited to high-dimensional datasets. To address these problems, this paper proposes feature selection using a forest optimization algorithm based on duplication analysis (DAFSFOA). The algorithm introduces an adaptive initialization strategy based on information gain, a forest duplication analysis mechanism, a forest restart mechanism, a candidate optimal tree generation strategy, and a fitness function that combines the number of selected features with the classification accuracy. Experimental results show that DAFSFOA achieves the highest classification accuracy on most datasets. On the high-dimensional dataset SRBCT, DAFSFOA substantially improves on feature selection using forest optimization algorithm (FSFOA) in both dimensionality reduction rate and classification accuracy. DAFSFOA explores the feature space more thoroughly than FSFOA and adapts to datasets of different dimensions.
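
The abstract names two ingredients without giving their formulas: information gain, which drives the initialization, and a fitness function that weighs classification accuracy against the number of selected features. The Python sketch below shows one plausible form of each. It is not the authors' implementation: the function names, the linear weighting, and the default weight `alpha` are assumptions made for illustration, and the information gain shown is the standard textbook definition for discrete features.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their conditional entropy given a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def fitness(accuracy, n_selected, n_total, alpha=0.9):
    """Hypothetical fitness: weighted sum of accuracy and dimensionality reduction rate.

    The linear form and the weight alpha are assumptions, not taken from the paper.
    """
    reduction_rate = 1.0 - n_selected / n_total
    return alpha * accuracy + (1.0 - alpha) * reduction_rate
```

Under these assumptions, features could be ranked by `information_gain` to bias which ones the initial trees select, while `fitness` lets a candidate tree that uses fewer features outrank one with the same accuracy.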