[1] TANG Rong, LUO Chuan, CAO Qian, et al. Incremental approach for feature selection in incomplete data while updating feature values[J]. CAAI Transactions on Intelligent Systems, 2021, 16(3): 493-501. [doi:10.11992/tis.202006045]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
16
Issue:
No. 3, 2021
Pages:
493-501
Section:
Academic Papers - Knowledge Engineering
Publication date:
2021-05-05
- Title:
-
Incremental approach for feature selection in incomplete data while updating feature values
- Author(s):
-
TANG Rong, LUO Chuan, CAO Qian, WANG Sizhao
-
College of Computer Science, Sichuan University, Chengdu 610065, China
-
- Keywords:
-
feature selection; dimensionality reduction; rough set; information entropy; incomplete data; missing values; heuristic search; incremental learning
- Classification number:
-
TP18
- DOI:
-
10.11992/tis.202006045
- Abstract:
-
In practical applications, data are often both incomplete and dynamic. To address feature selection in dynamic incomplete data, an incremental feature selection method based on the tolerance rough set model and information entropy theory is proposed. First, update patterns are established for the conditional partition and the decision classification of the universe when feature values change in an incomplete information system. Next, the incremental computing mechanism of the incomplete tolerance information entropy, which serves as the evaluation criterion for feature importance, is analyzed. This mechanism is then integrated into the iterative calculation of feature importance during the heuristic search for the optimal feature subset, yielding an incremental feature selection algorithm for incomplete data whose feature values are dynamically updated. Finally, experiments on standard UCI datasets verify the effectiveness and efficiency of the proposed incremental algorithm in terms of classification accuracy, decision performance, and computational efficiency.
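The abstract names three ingredients: a tolerance relation over incomplete data, an entropy-based measure of feature importance, and a heuristic search for a feature subset. The sketch below illustrates how these might fit together in the simplest, non-incremental form. It is not the authors' algorithm: it recomputes everything from scratch and does not implement the incremental update mechanism. The conditional entropy used here is one tolerance-based form chosen for illustration, and whether it matches the paper's "incomplete tolerance information entropy" is an assumption. All names (`MISSING`, `tolerant`, `tolerance_classes`, `conditional_entropy`, `greedy_select`) and the toy data are hypothetical.

```python
import math

MISSING = None  # assumed marker for a missing feature value


def tolerant(x, y, features):
    """True if rows x and y agree on every feature, with MISSING as a wildcard."""
    return all(
        x[a] is MISSING or y[a] is MISSING or x[a] == y[a] for a in features
    )


def tolerance_classes(rows, features):
    """For each row index i, the set of row indices tolerant with row i."""
    n = len(rows)
    return [
        {j for j in range(n) if tolerant(rows[i], rows[j], features)}
        for i in range(n)
    ]


def conditional_entropy(rows, labels, features):
    """Illustrative measure (an assumption, not taken from the paper):
    the mean over rows of -log2(|T(x) with the same label as x| / |T(x)|),
    where T(x) is the tolerance class of x under the given features."""
    total = 0.0
    for i, cls in enumerate(tolerance_classes(rows, features)):
        same = sum(1 for j in cls if labels[j] == labels[i])
        total -= math.log2(same / len(cls))  # same >= 1 because i is in cls
    return total / len(rows)


def greedy_select(rows, labels, eps=1e-12):
    """Forward heuristic search: repeatedly add the feature that lowers the
    entropy most, until the subset matches the entropy of the full feature set."""
    all_features = list(range(len(rows[0])))
    target = conditional_entropy(rows, labels, all_features)
    selected = []
    while conditional_entropy(rows, labels, selected) > target + eps:
        remaining = [a for a in all_features if a not in selected]
        best = min(
            remaining,
            key=lambda a: conditional_entropy(rows, labels, selected + [a]),
        )
        selected.append(best)
    return selected


# Toy incomplete decision table: 4 objects, 3 features, MISSING marks gaps.
rows = [
    (1, 0, MISSING),
    (1, 1, 0),
    (0, MISSING, 0),
    (0, 1, 1),
]
labels = [0, 0, 1, 1]

print(greedy_select(rows, labels))  # → [0]
```

In this toy table, feature 0 alone separates the two decision classes, so the search stops after one step. When feature values change, a from-scratch approach like this one recomputes every tolerance class; the method described in the abstract is intended to avoid that cost by updating the entropy incrementally.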
Last Update:
2021-06-25