[1] LI Jianyu, ZHAN Zhihui. A self-supervised data-driven particle swarm optimization approach for large-scale feature selection[J]. CAAI Transactions on Intelligent Systems, 2023, 18(1): 194-206. [doi:10.11992/tis.202206008]
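CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
18
Issue:
2023, No. 1
Pages:
194-206
Column:
Wu Wenjun Artificial Intelligence Science and Technology Award Forum
Publication date:
2023-01-05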
- Title:
-
A self-supervised data-driven particle swarm optimization approach for large-scale feature selection
- Author(s):
-
LI Jianyu (黎建宇), ZHAN Zhihui (詹志辉)
-
School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China
-
- Keywords:
-
feature selection; large-scale optimization; particle swarm optimization; evolutionary computation; swarm intelligence; data-driven; self-supervised learning; discrete region encoding
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.202206008
- Abstract:
-
Large-scale feature selection usually faces two challenges: 1) ground-truth labels are too scarce to guide the selection of features, and 2) the search space is so large that a satisfactory high-quality solution is hard to find. To address these challenges, this paper proposes a novel self-supervised data-driven particle swarm optimization algorithm for large-scale feature selection, with three contributions. First, a novel algorithmic framework, self-supervised data-driven feature selection, is proposed, which performs feature selection without relying on ground-truth labels. Second, a search strategy based on discrete region encoding is proposed, which helps the algorithm find better solutions in a large-scale search space. Third, building on this framework and strategy, a self-supervised data-driven particle swarm optimization algorithm is proposed to solve the large-scale feature selection problem. Experimental results on datasets with large-scale features show that the proposed algorithm performs comparably to mainstream supervised algorithms and achieves higher feature selection efficiency than state-of-the-art unsupervised algorithms.
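Memo
Received: 2022-06-06.
Funding: National Key Research and Development Program of China (2019YFB2102102); General Program of the National Natural Science Foundation of China (62176094).
About the authors: LI Jianyu is a Ph.D. candidate whose main research interests are artificial intelligence, evolutionary computation, swarm intelligence, knowledge learning, and data-driven methods. ZHAN Zhihui is a professor, doctoral supervisor, and Elsevier Highly Cited Chinese Researcher whose main research interests are artificial intelligence, evolutionary computation, swarm intelligence, cloud computing, and big data. Zhan has received the Wu Wenjun Artificial Intelligence Excellent Youth Award and the IEEE Computational Intelligence Society Outstanding Early Career Award, and has published (or had accepted) more than 150 papers in international journals and conferences, including more than 60 in IEEE Transactions journals, which are top international journals in the computer field. These papers have been cited more than 10,000 times by international peers (Google Scholar), including more than 5,000 SCI citations. Eleven of the papers have been selected as ESI Highly Cited Papers (top 1% by global impact), including one ESI Hot Paper (top 0.1% by global impact).
Corresponding author: ZHAN Zhihui. E-mail: zhanapollo@163.com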