[1]高媛,陈向坚,王平心,等.面向一致性样本的属性约简[J].智能系统学报,2019,14(06):1170-1178.[doi:10.11992/tis.201905051]
 GAO Yuan,CHEN Xiangjian,WANG Pingxin,et al.Attribute reduction over consistent samples[J].CAAI Transactions on Intelligent Systems,2019,14(06):1170-1178.[doi:10.11992/tis.201905051]

面向一致性样本的属性约简

《智能系统学报》[ISSN:1673-4785/CN:23-1538/TP]

Volume:
Vol. 14
Issue:
No. 6, 2019
Pages:
1170-1178
Publication date:
2019-11-05

文章信息/Info

Title:
Attribute reduction over consistent samples
作者:
高媛1 陈向坚1 王平心2 杨习贝1
1. 江苏科技大学 计算机学院, 江苏 镇江 212003;
2. 江苏科技大学 理学院, 江苏 镇江 212003
Author(s):
GAO Yuan1 CHEN Xiangjian1 WANG Pingxin2 YANG Xibei1
1. School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212003, China;
2. School of Science, Jiangsu University of Science and Technology, Zhenjiang 212003, China
关键词:
属性约简; 分类精度; 聚类; 一致性样本; 集成; 启发式算法; 邻域粗糙集; 多准则
Keywords:
attribute reduction; classification accuracy; clustering; consistent samples; ensemble; heuristic algorithm; neighborhood rough set; multiple criteria
CLC number:
TP181
DOI:
10.11992/tis.201905051
摘要:
作为粗糙集理论的一个核心内容,属性约简致力于根据给定的约束条件删除数据中的冗余属性。基于贪心策略的启发式算法是求解约简的一种有效手段,这一手段通常使用数据中的全部样本来度量属性的重要度从而进一步得到约简子集。但实际上,不同样本对于属性重要度计算的贡献是不同的,有些样本对重要度贡献不高甚至几乎没有贡献,且当数据中的样本数过大时,利用全部样本进行约简求解会使得时间消耗过大而难以接受。为了解决这一问题,提出了一种基于一致性样本的属性约简策略。具体算法大致由3个步骤组成,首先,将满足一致性原则的样本挑选出来;其次,将这些选中的样本组成新的决策系统;最后,利用启发式框架在新的决策系统中求解约简。实验结果表明:与基于聚类采样的属性约简算法相比,所提方法能够提供更高的分类精度。
Abstract:
As one of the key topics in rough set theory, attribute reduction aims to remove redundant attributes from a data set according to a given constraint. Heuristic algorithms based on a greedy strategy are an effective means of finding reductions. Traditional heuristic algorithms usually scan all samples in a data set to compute the significance of attributes and thereby obtain a reduction. In practice, however, different samples contribute differently to the computation of significance: some contribute little, and some contribute nothing at all. Scanning all samples to compute a reduction can therefore be prohibitively time-consuming when the number of samples is large. To address this problem, we propose an attribute reduction algorithm with sample selection based on the consistency principle. The algorithm consists of three stages: first, the samples satisfying the consistency principle are selected; second, a new decision system is constructed from these selected samples; finally, a reduction is derived by the heuristic algorithm over the new decision system. Experimental results demonstrate that, compared with an attribute reduction algorithm using cluster-based sample selection, the proposed algorithm offers better classification accuracy.
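The three-stage procedure summarized in the abstract can be sketched in code. The abstract gives no formal definitions, so the following Python sketch rests on stated assumptions: a sample is treated as "consistent" when its nearest neighbor under the full attribute set shares its decision label, and attribute significance is a plain nearest-neighbor dependency degree rather than the paper's neighborhood rough set measure. The names `select_consistent`, `dependency`, and `heuristic_reduct` are hypothetical, not from the paper.

```python
# Minimal sketch (not the paper's exact algorithm) of attribute reduction
# over consistent samples. Assumption: a sample is "consistent" if its
# nearest neighbor under all attributes shares its decision label.

from math import dist

def nearest_label(i, X, y, attrs):
    """Decision label of the sample nearest to X[i], measured on `attrs` only."""
    best_j = min((j for j in range(len(X)) if j != i),
                 key=lambda j: dist([X[i][a] for a in attrs],
                                    [X[j][a] for a in attrs]))
    return y[best_j]

def select_consistent(X, y):
    """Stage 1: keep only samples whose nearest neighbor agrees on the label."""
    all_attrs = range(len(X[0]))
    keep = [i for i in range(len(X)) if nearest_label(i, X, y, all_attrs) == y[i]]
    return [X[i] for i in keep], [y[i] for i in keep]

def dependency(X, y, attrs):
    """Assumed significance measure: fraction of samples whose nearest
    neighbor w.r.t. `attrs` carries the same decision label."""
    hits = sum(nearest_label(i, X, y, attrs) == y[i] for i in range(len(X)))
    return hits / len(X)

def heuristic_reduct(X, y):
    """Stages 2-3: build the reduced decision system, then run a greedy
    forward (heuristic) search for a reduct on it."""
    Xs, ys = select_consistent(X, y)          # stage 2: new decision system
    remaining = set(range(len(Xs[0])))
    reduct, base = [], 0.0
    while remaining:
        # pick the attribute whose addition yields the highest dependency
        a = max(remaining, key=lambda a: dependency(Xs, ys, reduct + [a]))
        gain = dependency(Xs, ys, reduct + [a]) - base
        if gain <= 0:                         # no attribute helps: stop
            break
        reduct.append(a)
        base += gain
        remaining.remove(a)
    return reduct
```

The greedy loop stops as soon as adding the best remaining attribute no longer raises the dependency degree, which is the usual stopping rule in heuristic reduction frameworks; because the search runs only over the selected consistent samples, its cost scales with the reduced sample set rather than the full data set.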

备注/Memo

Received: 2019-05-27.
Foundation items: National Natural Science Foundation of China (61572242, 61503160); Postgraduate Research and Practice Innovation Program of Jiangsu Province (KYCX19_1697).
Author biographies: GAO Yuan, born in 1994, is a master's student; her main research interests are rough set theory and machine learning. CHEN Xiangjian, born in 1983, is an associate professor with a PhD; her main research interests are fuzzy neural networks and intelligent control. She has led one National Natural Science Foundation of China project and published more than 20 academic papers. WANG Pingxin, born in 1980, is an associate professor with a PhD; his main research interests are matrix analysis and granular computing. He has led one National Natural Science Foundation of China project and published more than 30 academic papers.
Corresponding author: YANG Xibei. E-mail: jsjxy_yxb@just.edu.cn
更新日期/Last Update: 2019-12-25