[1] LIU Yanglei, LIANG Jiye, GAO Jiawei, et al. Semi-supervised multi-label learning algorithm based on Tri-training[J]. CAAI Transactions on Intelligent Systems, 2013, 8(05): 439-445. [doi:10.3969/j.issn.1673-4785.201305033]

Semi-supervised multi-label learning algorithm based on Tri-training

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
8
Issue:
2013, No. 05
Pages:
439-445
Section:
Publication date:
2013-10-25

Article Info

Title:
Semi-supervised multi-label learning algorithm based on Tri-training
Article ID:
1673-4785(2013)05-439-07
Author(s):
LIU Yanglei (1, 2), LIANG Jiye (1, 2), GAO Jiawei (1, 2), YANG Jing (1, 2)
1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China; 2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China
Keywords:
multi-label learning; semi-supervised learning; Tri-training
CLC number:
TP181
DOI:
10.3969/j.issn.1673-4785.201305033
Document code:
A
Abstract:
Traditional multi-label learning is supervised and requires training samples with complete category labels. However, when the data set is large and the number of categories is high, obtaining a training set with complete labels is very difficult. Therefore, within the framework of semi-supervised co-training, a semi-supervised multi-label learning algorithm based on Tri-training (SMLT) is proposed. In the learning stage, SMLT introduces a virtual label and then, for each pair of labels, uses the co-training mechanism of the Tri-training algorithm to train the corresponding classifier. In the prediction stage, a new sample is fed to the classifiers obtained above. According to the number of votes each label receives, the multi-label learning problem is transformed into a label ranking problem, and the number of votes received by the virtual label is used as the threshold for splitting the label ranking. Comparative experiments on four commonly used UCI multi-label data sets show that SMLT mostly outperforms the other compared algorithms on four evaluation metrics, which verifies the effectiveness of the algorithm.
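The prediction stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each pairwise classifier is a callable returning one of its two labels, that the three classifiers produced by Tri-training for a pair are combined by majority vote, and that a label is kept only when it receives strictly more votes than the virtual label. The names `VIRTUAL`, `majority_of_three`, and `predict_label_set` are hypothetical.

```python
from collections import Counter
from itertools import combinations

VIRTUAL = "__virtual__"  # hypothetical name for the virtual label


def majority_of_three(h1, h2, h3):
    """Combine three classifiers for one label pair by majority vote.

    Assumption: this stands in for the three classifiers that
    Tri-training would produce for that pair.
    """
    def combined(x):
        return Counter([h1(x), h2(x), h3(x)]).most_common(1)[0][0]
    return combined


def predict_label_set(x, labels, classifiers):
    """Rank labels by pairwise votes and cut the ranking at the virtual label.

    `classifiers` maps each pair from
    combinations(list(labels) + [VIRTUAL], 2) to a callable that
    returns the winning label of that pair for sample x.
    """
    all_labels = list(labels) + [VIRTUAL]
    votes = {label: 0 for label in all_labels}
    for pair in combinations(all_labels, 2):
        winner = classifiers[pair](x)
        votes[winner] += 1
    threshold = votes[VIRTUAL]  # votes of the virtual label act as the cut-off
    ranking = sorted(labels, key=lambda label: votes[label], reverse=True)
    # Assumption: ties with the virtual label are treated as irrelevant.
    return [label for label in ranking if votes[label] > threshold]
```

With q real labels this scheme needs q(q+1)/2 pairwise classifiers, since the virtual label is paired with every real label as well.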



Memo:
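Received: 2013-05-09. Published online: 2013-09-29.
Funding: Preliminary Research Special Project of the National "973" Program (2011CB311805); Shanxi Province Science and Technology Key Project (20110321027-01); Shanxi Province Science and Technology Infrastructure Platform Construction Project (2012091002-0101).
Corresponding author: LIANG Jiye. E-mail: ljy@sxu.edu.cn.
About the authors:
LIU Yanglei, male, born in 1990, master's student. His main research interest is machine learning. He has published 3 academic papers and holds 3 registered computer software copyrights.
LIANG Jiye, male, born in 1962, professor, doctoral supervisor, Ph.D. His main research interests include machine learning, computational intelligence, and data mining. He has led 1 Key Program project of the National Natural Science Foundation of China, 2 National "863" Program projects, 1 preliminary research special project of the National "973" Program, and 4 National Natural Science Foundation of China projects. He has published more than 150 academic papers and 2 books, and holds 8 invention patents.
GAO Jiawei, male, born in 1980, lecturer. His main research interest is machine learning. He has participated in 1 National "863" Program project, 3 National Natural Science Foundation of China projects, and 4 Shanxi Provincial Natural Science Foundation projects, and has published more than 10 academic papers.
Last Update: 2013-11-28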