A feature selection approach based on rough set relative classification information entropy and particle swarm optimization - CAAI Transactions on Intelligent Systems

[1] ZHAI Junhai, LIU Bo, ZHANG Sufang. A feature selection approach based on rough set relative classification information entropy and particle swarm optimization[J]. CAAI Transactions on Intelligent Systems, 2017, 12(03): 397-404. [doi:10.11992/tis.201705004]

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
12
Issue:
2017, No. 03
Pages:
397-404
Publication date:
2017-06-25

Article Info

Title:
A feature selection approach based on rough set relative classification information entropy and particle swarm optimization
Author(s):
ZHAI Junhai (1, 2); LIU Bo (3); ZHANG Sufang (4)
1. Key Lab of Machine Learning and Computational Intelligence, Hebei University, Baoding 071002, China;
2. College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua 321004, China;
3. College of Computer Science and Technology, Hebei University, Baoding 071002, China;
4. Hebei Branch of Meteorological Cadres Training Institute, China Meteorological Administration, Baoding 071000, China
Keywords:
data mining; feature selection; data preprocessing; rough set; decision table; particle swarm optimization; information entropy; fitness function
CLC number:
TP181
DOI:
10.11992/tis.201705004
Abstract:
Feature selection is the process of choosing a subset of an original feature set according to given criteria, and it is an important preprocessing step in data mining. By removing irrelevant and redundant features, it reduces the computational complexity of the learning algorithm and improves its performance. For the problem of feature selection on discrete-valued data, this paper proposes an approach that combines rough set relative classification information entropy with particle swarm optimization. A particle swarm optimization algorithm searches for the optimal feature subset, and relative classification information entropy serves as the fitness function that measures the significance of each candidate subset. The proposed approach was compared experimentally with other feature selection methods based on evolutionary algorithms. The experimental results indicate that the proposed approach has advantages over the genetic algorithm-based methods.
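The search described in the abstract can be illustrated with a small sketch. This page does not give the paper's definition of relative classification information entropy, so the code below substitutes the ordinary conditional entropy H(D | B) of the decision labels given a candidate attribute subset B, plus a small size penalty; that fitness is an assumption, not the authors' formula. The search is a generic binary particle swarm with a sigmoid transfer function, and every name and parameter (`conditional_entropy`, `fitness`, `binary_pso`, `penalty`, `w`, `c1`, `c2`) is hypothetical.

```python
import math
import random
from collections import Counter, defaultdict
from itertools import product


def conditional_entropy(rows, labels, subset):
    """H(D | B): label entropy inside each block of the partition induced
    by the attribute indices in `subset` (stand-in fitness, an assumption)."""
    blocks = defaultdict(Counter)
    for row, label in zip(rows, labels):
        blocks[tuple(row[i] for i in subset)][label] += 1
    n = len(rows)
    h = 0.0
    for counts in blocks.values():
        size = sum(counts.values())
        for c in counts.values():
            h -= (size / n) * (c / size) * math.log2(c / size)
    return h


def fitness(rows, labels, mask, penalty=0.01):
    """Lower is better: residual uncertainty plus a penalty per selected feature."""
    subset = [i for i, bit in enumerate(mask) if bit]
    return conditional_entropy(rows, labels, subset) + penalty * len(subset)


def binary_pso(rows, labels, n_particles=10, n_iter=30, seed=0,
               w=0.7, c1=1.5, c2=1.5):
    """Generic binary PSO: each particle is a 0/1 mask over the attributes."""
    rng = random.Random(seed)
    dim = len(rows[0])
    pos = [[rng.randint(0, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(rows, labels, p) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for k in range(n_particles):
            for j in range(dim):
                v = (w * vel[k][j]
                     + c1 * rng.random() * (pbest[k][j] - pos[k][j])
                     + c2 * rng.random() * (gbest[j] - pos[k][j]))
                vel[k][j] = max(-4.0, min(4.0, v))  # clamp the velocity
                # The sigmoid of the velocity is the probability of setting the bit.
                prob = 1.0 / (1.0 + math.exp(-vel[k][j]))
                pos[k][j] = 1 if rng.random() < prob else 0
            f = fitness(rows, labels, pos[k])
            if f < pbest_fit[k]:
                pbest[k], pbest_fit[k] = pos[k][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[k][:], f
    return gbest, gbest_fit


# Toy decision table: four binary attributes, the label copies attribute 0.
rows = [list(bits) for bits in product((0, 1), repeat=4)]
labels = [row[0] for row in rows]
best_mask, best_fit = binary_pso(rows, labels)
print(best_mask, round(best_fit, 3))
```

On this toy table only attribute 0 carries information about the label, so a mask that selects it alone minimizes the stand-in fitness. The size penalty is one simple way to prefer smaller subsets among those that classify equally well; the paper may handle subset size differently.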

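Memo:
Received: 2017-05-07.
Funding: National Natural Science Foundation of China (71371063); Natural Science Foundation of Hebei Province (F2017201026); Zhejiang Provincial Top Key Discipline of Computer Science and Technology (Zhejiang Normal University).
Author biographies: ZHAI Junhai, male, born in 1964, professor, member of the Rough Set and Soft Computing Technical Committee of the Chinese Association for Artificial Intelligence; his main research interest is machine learning. In recent years he has led or participated in more than 10 projects at the provincial and ministerial level or above, won one third prize of the Hebei Province Natural Science Award, and published 4 monographs and more than 70 papers. LIU Bo, male, born in 1989, master's student; his main research interest is machine learning. ZHANG Sufang, female, born in 1966, associate professor; her main research interest is machine learning.
Corresponding author: ZHAI Junhai. E-mail: mczjh@126.com.
Last Update: 2017-06-25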