[1] LI Jingzheng, YANG Xibei, DOU Huili, et al. Research on ensemble significance based attribute reduction approach[J]. CAAI Transactions on Intelligent Systems, 2018, 13(03): 414-421. [doi:10.11992/tis.201706080]

Research on ensemble significance based attribute reduction approach

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
13
Issue:
No. 03, 2018
Pages:
414-421
Publication date:
2018-05-05

Article Info

Title:
Research on ensemble significance based attribute reduction approach
Author(s):
LI Jingzheng (1), YANG Xibei (1, 2), DOU Huili (1), WANG Pingxin (3), CHEN Xiangjian (1)
1. School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212003, China;
2. School of Economics and Management, Nanjing University of Science and Technology, Nanjing 210094, China;
3. School of Mathematics and Physics, Jiangsu University of Science and Technology, Zhenjiang 212003, China
Keywords:
attribute reduction; classification; clustering; data perturbation; ensemble; heuristic algorithm; neighborhood rough set; stability
Classification number:
TP391
DOI:
10.11992/tis.201706080
Abstract:
A heuristic algorithm computes a reduct by repeatedly adding the attribute with the highest significance. This approach overlooks the fact that data perturbation directly causes the computed significances to fluctuate, which in turn makes the resulting reduct unstable. To address this, a heuristic algorithm framework based on ensemble attribute significance is proposed. First, multiple samplings are drawn from the raw data; then, in each iteration, the significance of each attribute is computed on every sample and these significances are integrated; finally, the attribute with the highest ensemble significance is added to the reduct. Experimental results using the neighborhood rough set method show that attribute reduction based on ensemble significance not only obtains more stable reducts, but also yields classification results with higher consistency when the generated reducts are used.
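The three steps described in the abstract (sample, integrate significances, add the best attribute) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the sampling scheme, the integration rule, the significance measure, or the stopping criterion, so the code assumes subsampling without replacement, an arithmetic mean as the integration, neighborhood dependency (the fraction of samples whose neighborhood is label-consistent) as the significance, and termination once the reduct reaches the dependency of the full attribute set. The names `neighborhood_dependency`, `ensemble_reduct`, `delta`, `n_samples`, and `frac` are hypothetical.

```python
import numpy as np


def neighborhood_dependency(X, y, attrs, delta):
    """Fraction of samples whose delta-neighborhood on `attrs` is label-consistent."""
    if not attrs:
        return 0.0
    sub = X[:, attrs]
    # Pairwise Euclidean distances restricted to the chosen attributes.
    dist = np.sqrt(((sub[:, None, :] - sub[None, :, :]) ** 2).sum(axis=2))
    neighbors = dist <= delta
    same_label = y[:, None] == y[None, :]
    # A sample counts if every one of its neighbors shares its label.
    consistent = np.all(~neighbors | same_label, axis=1)
    return float(consistent.mean())


def ensemble_reduct(X, y, delta=0.2, n_samples=10, frac=0.8, seed=0):
    """Greedy forward selection driven by significance averaged over samples."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    size = max(1, int(frac * n))
    # Step 1 (assumed scheme): draw several subsamples of the raw data.
    samples = [rng.choice(n, size=size, replace=False) for _ in range(n_samples)]

    def mean_dependency(attrs):
        # Step 2 (assumed rule): integrate per-sample values by averaging.
        return float(np.mean(
            [neighborhood_dependency(X[idx], y[idx], attrs, delta) for idx in samples]
        ))

    target = mean_dependency(list(range(m)))  # assumed stopping threshold
    reduct, remaining, current = [], list(range(m)), 0.0
    while remaining and current < target - 1e-12:
        scores = {a: mean_dependency(reduct + [a]) for a in remaining}
        # Step 3: add the attribute with the highest ensemble score.
        best = max(scores, key=scores.get)
        reduct.append(best)
        remaining.remove(best)
        current = scores[best]
    return reduct


# Toy data: attribute 0 separates the two classes, attributes 1 and 2 are noise.
demo_rng = np.random.default_rng(1)
y_demo = np.repeat([0, 1], 30)
X_demo = np.column_stack([y_demo.astype(float), demo_rng.random(60), demo_rng.random(60)])
print(ensemble_reduct(X_demo, y_demo))
```

Averaging is only one possible integration rule; a median or a rank-based vote would fit the same framework, and the sketch makes no claim about which one the paper evaluates.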



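Memo

Received: 2017-06-24.
Funding: National Natural Science Foundation of China (61572242, 61503160, 61502211); Philosophy and Social Science Foundation of Jiangsu Higher Education Institutions (2015SJD769); China Postdoctoral Science Foundation (2014M550293).
About the authors: LI Jingzheng, male, born in 1993, master's student; main research interests: rough set theory and machine learning. YANG Xibei, male, born in 1980, associate professor and postdoctoral researcher; main research interests: rough set theory, granular computing, and machine learning; has published more than 100 academic papers, over 50 of them indexed by SCI, and one English-language monograph. DOU Huili, female, born in 1980, assistant researcher; main research interests: granular computing and intelligent information processing.
Corresponding author: YANG Xibei. E-mail: zhenjiangyangxibei@163.com.
Last Update: 2018-06-25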