[1] MIN Fan, WANG Hongjie, LIU Fulun, et al. SUCE: semi-supervised binary classification based on clustering ensemble[J]. CAAI Transactions on Intelligent Systems, 2018, 13(6): 974-980. [doi:10.11992/tis.201711027]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 13
Issue: No. 6, 2018
Pages: 974-980
Section: Academic Papers: Machine Learning
Publication date: 2018-10-25
- Title:
-
SUCE: semi-supervised binary classification based on clustering ensemble
- Author(s):
-
MIN Fan (闵帆), WANG Hongjie (王宏杰), LIU Fulun (刘福伦), WANG Xuan (王轩)
-
School of Computer Science, Southwest Petroleum University, Chengdu 610500, China
- Keywords:
-
ensemble learning; clustering; clustering ensemble; semi-supervised; binary classification
- Classification number:
-
TP181
- DOI:
-
10.11992/tis.201711027
- Abstract:
-
Semi-supervised learning and ensemble learning are important methods in machine learning. Semi-supervised learning exploits unlabeled samples, while ensemble learning combines multiple weak learners to improve classification accuracy. For nominal (symbolic) data, this paper proposes a semi-supervised classification method that fuses clustering and ensemble learning, called Semi-sUpervised classification through Clustering and Ensemble learning (SUCE). Under different parameter settings, multiple clustering algorithms are used to generate a large number of weak learners. The weak learners are evaluated and selected using the existing class label information. The test set is pre-classified by an ensemble of the selected weak learners, and samples with high confidence are moved to the training set. The remaining samples are then classified with the extended training set by base algorithms such as ID3, Naive Bayes, kNN, C4.5, OneR, and Logistic. Experimental results on UCI datasets show that, when training samples are few, SUCE steadily improves the accuracy of most of the base algorithms.
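The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several choices here are assumptions made only for illustration: a single tiny k-modes clusterer varied by random seed stands in for the paper's "multiple clustering algorithms under different parameter settings", the agreement ratio `min_agree` stands in for the unspecified confidence measure, and a 1-nearest-neighbor rule under Hamming distance stands in for the base classifiers. The names `k_modes`, `suce_sketch`, `min_acc`, and `min_agree` are hypothetical.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of attributes on which two nominal rows differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(rows, k, seed, iters=10):
    """Tiny k-modes clustering (assumed stand-in); returns a cluster index per row."""
    rng = random.Random(seed)
    centers = [rows[i] for i in rng.sample(range(len(rows)), k)]
    assign = [0] * len(rows)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: hamming(r, centers[c])) for r in rows]
        for c in range(k):
            members = [r for r, a in zip(rows, assign) if a == c]
            if members:  # an empty cluster keeps its old center
                centers[c] = tuple(
                    Counter(col).most_common(1)[0][0] for col in zip(*members)
                )
    return assign


def suce_sketch(X_lab, y_lab, X_unl, n_learners=20, min_acc=0.8, min_agree=0.9):
    """Illustrative SUCE-style pipeline for binary labels on nominal data."""
    rows = list(X_lab) + list(X_unl)
    n_lab = len(X_lab)
    votes = [Counter() for _ in X_unl]

    # Steps 1-2: generate weak learners, then evaluate and select them
    # with the known class labels.
    for seed in range(n_learners):
        assign = k_modes(rows, k=2, seed=seed)
        by_cluster = {}
        for a, y in zip(assign[:n_lab], y_lab):
            by_cluster.setdefault(a, Counter())[y] += 1
        # Each cluster is named after the majority class of its labeled members.
        to_label = {c: cnt.most_common(1)[0][0] for c, cnt in by_cluster.items()}
        acc = sum(to_label[a] == y for a, y in zip(assign[:n_lab], y_lab)) / n_lab
        if acc < min_acc:
            continue  # drop learners that disagree with the known labels
        for i, a in enumerate(assign[n_lab:]):
            if a in to_label:
                votes[i][to_label[a]] += 1

    # Step 3: pre-classify by voting; confident samples extend the training set.
    X_ext, y_ext = list(X_lab), list(y_lab)
    preds = [None] * len(X_unl)
    for i, v in enumerate(votes):
        if v:
            label, count = v.most_common(1)[0]
            if count / sum(v.values()) >= min_agree:
                preds[i] = label
                X_ext.append(X_unl[i])
                y_ext.append(label)

    # Step 4: a base classifier (assumed here: 1-NN) handles the remaining samples.
    for i, x in enumerate(X_unl):
        if preds[i] is None:
            nearest = min(range(len(X_ext)), key=lambda t: hamming(x, X_ext[t]))
            preds[i] = y_ext[nearest]
    return preds


# Hypothetical toy data: two nominal attributes, two labeled rows per class.
X_lab = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
y_lab = [0, 0, 1, 1]
X_unl = [("a", "x"), ("b", "y"), ("a", "x")]
print(suce_sketch(X_lab, y_lab, X_unl))
```

Raising `min_acc` or `min_agree` moves fewer samples into the extended training set and leaves more of the work to the base classifier. Lowering them does the opposite, at the risk of adding wrongly labeled samples.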
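Memo
Received: 2017-11-21.
Funding: National Natural Science Foundation of China (61379089).
About the authors: MIN Fan, male, born in 1973, professor and doctoral supervisor. His main research interests are granular computing, cost-sensitive learning, and recommender systems. He has led one National Natural Science Foundation of China project and has published more than 100 academic papers, of which more than 30 are indexed by SCI. WANG Hongjie, male, born in 1992, master's student. His main research interests are granular computing and cost-sensitive learning. He has published 7 academic papers, 1 of which is indexed by EI. LIU Fulun, male, born in 1993, master's student. His main research interests are cost-sensitive learning and rough sets. He has published 5 academic papers, of which 2 are indexed by SCI and 1 by EI.
Corresponding author: MIN Fan. E-mail: minfanphd@163.com
Last Update: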
2018-12-25