[1] SHI Yingzhong, WANG Shitong, DENG Zhaohong, et al. The core vector machine-based rapid classification of multi-task concept drift dataset[J]. CAAI Transactions on Intelligent Systems, 2018, 13(06): 935-945. [doi:10.11992/tis.201712019]

The core vector machine-based rapid classification of multi-task concept drift dataset

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
13
Issue:
2018, No. 06
Pages:
935-945
Section:
Publication date:
2018-10-25

Article Info

Title:
The core vector machine-based rapid classification of multi-task concept drift dataset
Author(s):
SHI Yingzhong 1,2; WANG Shitong 1; DENG Zhaohong 1,3; HOU Ligong 2; QIAN Dongjie 2
1. School of Digital Media, Jiangnan University, Wuxi 214122, China;
2. School of Internet of Things, Wuxi Institute of Technology, Wuxi 214121, China;
3. Jiangsu Key Laboratory of Media Design and Software Technology, Jiangnan University, Wuxi 214122, China
Keywords:
multi-task; large-scale dataset; concept drift; core vector machines; linear time complexity
CLC number:
TP181
DOI:
10.11992/tis.201712019
Abstract:
By jointly solving multiple concept drift problems and fully exploiting the useful information contained in related concept drift problems, the shared vector chain support vector machine (SVC-SVM) performs well in multi-task concept drift classification. In practice, however, concept drift problems usually involve large volumes of data, and the high computational cost of SVC-SVM limits its wider adoption. To overcome this weakness, and drawing on the near-linear time complexity of core vector machines, a shared vector chain core vector machine (SVC-CVM) is proposed for large-scale multi-task concept drift data. SVC-CVM has asymptotically linear time complexity while inheriting the good performance that SVC-SVM gains from jointly solving multiple concept drift problems. Experiments confirm the effectiveness and speed of the method on large-scale multi-task concept drift datasets.

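Memo:
Received: 2017-12-13.
Funding: National Natural Science Foundation of China (61300151); Jiangsu Province Outstanding Youth Fund (BK20140001); Jiangsu Higher Education Teaching Reform Research Project (2017JSJG282); Natural Science Research Project of Jiangsu Higher Education Institutions (18KJB520048).
Author biographies: SHI Yingzhong, male, born in 1970, associate professor, Ph.D. His main research interests are artificial intelligence and pattern recognition. He has participated in several research projects at the provincial level or above and has published more than 10 academic papers. WANG Shitong, male, born in 1964, professor and doctoral supervisor. His main research interests are artificial intelligence and pattern recognition. He has published nearly 100 academic papers, more than 50 of which are indexed by SCI or EI. DENG Zhaohong, male, born in 1981, professor and doctoral supervisor, senior member of the CCF. His main research interests are artificial intelligence and pattern recognition, intelligent computing, and system modeling.
Corresponding author: SHI Yingzhong. E-mail: shiyz@wxit.edu.cn
Last Update: 2018-12-25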