[1] LI Jingzheng, YANG Xibei, DOU Huili, et al. Research on ensemble significance based attribute reduction approach[J]. CAAI Transactions on Intelligent Systems, 2018, 13(3): 414-421. [doi:10.11992/tis.201706080]
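CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
13
Issue:
2018, No. 3
Pages:
414-421
Column:
Academic Papers: Fundamentals of Artificial Intelligence
Publication date:
2018-05-05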
- Title:
-
Research on ensemble significance based attribute reduction approach
- Author(s):
-
LI Jingzheng1, YANG Xibei1,2, DOU Huili1, WANG Pingxin3, CHEN Xiangjian1
-
1. School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212003, China;
2. School of Economics and Management, Nanjing University of Science and Technology, Nanjing 210094, China;
3. School of Mathematics and Physics, Jiangsu University of Science and Technology, Zhenjiang 212003, China
-
- Keywords:
-
attribute reduction; classification; clustering; data perturbation; ensemble; heuristic algorithm; neighborhood rough set; stability
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.201706080
- Abstract:
-
When a heuristic algorithm computes a reduct, it adds the attribute with the highest significance one step at a time. However, this approach neglects the fact that data perturbation directly causes the computed significance values to fluctuate, which in turn makes the resulting reduct unstable. To address this problem, a heuristic algorithm framework based on ensemble attribute significance is proposed. First, the original data are sampled multiple times. Then, in each iteration, the significance of every candidate attribute is computed on each sample, and these significance values are combined into an ensemble significance. Finally, the attribute with the highest ensemble significance is added to the reduct. Experimental results obtained with the neighborhood rough set method show that the ensemble significance based attribute reduction algorithm not only produces more stable reducts, but that the reducts it generates also yield classification results with higher consistency.
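The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the samples are drawn, how the per-sample significance values are combined, or when the search stops. The sketch therefore assumes sampling without replacement, a plain average as the ensemble, and a stop once the averaged dependency of the reduct matches that of the full attribute set. The names `dependency`, `ensemble_reduct`, `n_samples`, `ratio`, and `radius` are hypothetical, and the significance measure is assumed to be the neighborhood rough set dependency (the fraction of samples whose neighborhood contains a single class).

```python
import numpy as np

def dependency(X, y, attrs, radius=0.2):
    """Assumed measure: share of samples whose radius-neighborhood
    (Euclidean distance over `attrs`) contains only their own class."""
    if not attrs:
        return 0.0
    Z = X[:, attrs]
    dist = np.sqrt(((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2))
    neighbors = dist <= radius
    same_label = y[:, None] == y[None, :]
    # A sample is consistent if every neighbor shares its label.
    return np.all(~neighbors | same_label, axis=1).mean()

def ensemble_reduct(X, y, n_samples=5, ratio=0.8, radius=0.2, seed=0, tol=1e-9):
    """Forward greedy search driven by significance averaged over samples."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    size = max(1, int(ratio * n))
    # Step 1: draw several samples of the original data (assumed: no replacement).
    samples = [rng.choice(n, size=size, replace=False) for _ in range(n_samples)]

    def averaged(attrs):
        # Assumed ensemble: the mean of the per-sample dependencies.
        return np.mean([dependency(X[i], y[i], attrs, radius) for i in samples])

    target = averaged(list(range(m)))
    reduct, remaining, current = [], list(range(m)), 0.0
    while remaining and current < target - tol:
        # Step 2: score each candidate on every sample and combine the scores.
        # Ranking by averaged dependency equals ranking by averaged gain,
        # because the dependency of the current reduct is a shared constant.
        scores = {a: averaged(reduct + [a]) for a in remaining}
        # Step 3: add the attribute with the highest ensemble score.
        best = max(scores, key=scores.get)
        reduct.append(best)
        remaining.remove(best)
        current = scores[best]
    return reduct
```

Calling `ensemble_reduct(X, y)` on a numeric feature matrix `X` (ideally scaled to a common range, since `radius` is an absolute distance) and a label vector `y` returns the selected attribute indices in the order they were added. Averaging is only one way to combine the per-sample values; a voting scheme would fit the same framework.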
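Memo
Received: 2017-06-24.
Funding: National Natural Science Foundation of China (61572242, 61503160, 61502211); Philosophy and Social Science Foundation of Jiangsu Higher Education Institutions (2015SJD769); China Postdoctoral Science Foundation (2014M550293).
About the authors: LI Jingzheng, male, born in 1993, master's student; his main research interests are rough set theory and machine learning. YANG Xibei, male, born in 1980, associate professor and postdoctoral researcher; his main research interests are rough set theory, granular computing, and machine learning. He has published more than 100 academic papers, over 50 of them indexed by SCI, and one monograph in English. DOU Huili, female, born in 1980, assistant researcher; her main research interests are granular computing and intelligent information processing.
Corresponding author: YANG Xibei. E-mail: zhenjiangyangxibei@163.com.
Last Update: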
2018-06-25