[1] LI Jianyu, ZHAN Zhihui. A self-supervised data-driven particle swarm optimization approach for large-scale feature selection[J]. CAAI Transactions on Intelligent Systems, 2023, 18(1): 194-206. [doi:10.11992/tis.202206008]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume:
18
Issue:
2023, No. 1
Pages:
194-206
Column:
Wu Wenjun Artificial Intelligence Science and Technology Award Forum
Public date:
2023-01-05
- Title:
A self-supervised data-driven particle swarm optimization approach for large-scale feature selection
- Author(s):
LI Jianyu; ZHAN Zhihui
School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China
- Keywords:
feature selection; large-scale optimization; particle swarm optimization; evolutionary computation; swarm intelligence; data-driven; self-supervised learning; discrete region encoding
- CLC:
TP391
- DOI:
10.11992/tis.202206008
- Abstract:
Large-scale feature selection problems usually face two challenges: 1) real labels are insufficient to guide the algorithm in selecting features, and 2) the large-scale search space makes it difficult to find a high-quality solution. To address these challenges, this paper proposes a novel self-supervised data-driven particle swarm optimization algorithm for large-scale feature selection, with three contributions. First, a novel algorithmic framework named self-supervised data-driven feature selection is proposed, which can perform feature selection without real labels. Second, a search strategy based on discrete region encoding is proposed, which helps the algorithm find better solutions in a large-scale search space. Third, building on this framework and search strategy, a self-supervised data-driven particle swarm optimization algorithm is proposed to solve the large-scale feature selection problem. Experimental results on datasets with large-scale features show that the proposed algorithm performs comparably to mainstream supervised algorithms and achieves higher feature selection efficiency than state-of-the-art unsupervised algorithms.
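For readers unfamiliar with swarm-based feature selection, the general idea can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the self-supervised objective or the details of discrete region encoding, so the code below makes two loudly stated assumptions. First, the label-free objective `proxy_fitness` (mean variance of the selected features minus a subset-size penalty) is a hypothetical stand-in. Second, `decode` splits each coordinate's range [0, 1] into two regions at 0.5, which is only one simple way to map continuous positions to discrete choices and may differ from the paper's encoding. All function and parameter names are illustrative.

```python
import random


def decode(position, threshold=0.5):
    """Map each continuous coordinate to a 0/1 'select this feature' decision.

    Assumption: two regions per dimension, split at `threshold`.
    """
    return [1 if x >= threshold else 0 for x in position]


def variance(column):
    """Population variance of one feature column."""
    mean = sum(column) / len(column)
    return sum((v - mean) ** 2 for v in column) / len(column)


def proxy_fitness(mask, variances, penalty=0.1):
    """Hypothetical label-free objective (higher is better).

    Rewards high-variance features and penalizes large subsets.
    An empty subset is treated as infeasible.
    """
    chosen = [v for v, m in zip(variances, mask) if m]
    if not chosen:
        return float("-inf")
    return sum(chosen) / len(chosen) - penalty * len(chosen) / len(mask)


def pso_select(data, n_particles=20, n_iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best PSO over positions in [0, 1]^n_features."""
    rng = random.Random(seed)
    n_features = len(data[0])
    variances = [variance([row[j] for row in data]) for j in range(n_features)]

    # Random initial positions, zero initial velocities.
    pos = [[rng.random() for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [proxy_fitness(decode(p), variances) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search range [0, 1].
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            fit = proxy_fitness(decode(pos[i]), variances)
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return decode(gbest), gbest_fit
```

Calling `pso_select(data)` on a list of equal-length numeric rows returns a 0/1 mask over the feature columns together with its proxy score. In a label-free setting like the one the abstract describes, the stand-in objective would be replaced by whatever signal the method derives from the data itself.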