
[1] ZHOU Jingyu, WANG Shitong. Multi-source online transfer learning for imbalanced target domains[J]. CAAI Transactions on Intelligent Systems, 2022, 17(2): 248-256. [doi:10.11992/tis.202012019]


Multi-source online transfer learning for imbalanced target domains


CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume: 17; Issue: 2022, No. 2; Pages: 248-256; Column: Academic Papers, Machine Learning; Publication date: 2022-03-05

Title:: Multi-source online transfer learning for imbalanced target domains

Author(s):: ZHOU Jingyu, WANG Shitong; School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China

Keywords:: multi-source transfer learning; online learning; target domain; imbalanced data; oversampling; k-nearest neighbor; input space; feature space

CLC number:: TP181

DOI:: 10.11992/tis.202012019

Abstract:: Multi-source online transfer learning is widely used in applications where the related source domains contain large amounts of labeled data and the target-domain data arrive as a data stream. However, the class distribution of the target domain is sometimes imbalanced. For the imbalanced binary classification problem in which multiple target-domain samples arrive online in each batch, this paper proposes a multi-source online transfer learning algorithm that oversamples the target-domain samples. The algorithm finds the k-nearest neighbors of the samples in the current batch among the samples of previous batches, first generates a small number of majority class samples, and then generates minority class samples so that the class distribution of the current batch is balanced. In each batch, the synthetic and real samples are used together to train the target-domain function, which improves its classification performance. Oversampling methods are designed for both the input space and the feature space of the target domain, and comprehensive experiments on multiple real-world data sets demonstrate the effectiveness of the proposed algorithm.
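
The batch-wise oversampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how synthetic samples are generated, so SMOTE-style linear interpolation toward a neighbor is assumed here. The function names (`knn_from_history`, `synthesize`, `oversample_batch`), the parameter `n_major_extra`, the labels +1/-1, and the restriction of neighbors to earlier samples of the same class are all hypothetical choices. Only input-space oversampling is shown; the feature-space variant is not covered.

```python
import numpy as np

def knn_from_history(x, history, k):
    """Return the k rows of `history` closest to `x` (Euclidean distance)."""
    dists = np.linalg.norm(history - x, axis=1)
    return history[np.argsort(dists)[:k]]

def synthesize(batch_cls, history_cls, n_new, k, rng):
    """Create n_new synthetic points for one class.

    Assumption: SMOTE-style interpolation between a sample of the current
    batch and one of its k nearest neighbours among earlier samples.
    """
    dim = batch_cls.shape[1]
    if n_new <= 0 or len(batch_cls) == 0 or len(history_cls) == 0:
        return np.empty((0, dim))
    out = []
    for _ in range(n_new):
        x = batch_cls[rng.integers(len(batch_cls))]
        nbrs = knn_from_history(x, history_cls, min(k, len(history_cls)))
        nb = nbrs[rng.integers(len(nbrs))]
        out.append(x + rng.random() * (nb - x))  # point on the segment x -> nb
    return np.array(out)

def oversample_batch(X, y, X_hist, y_hist, k=5, n_major_extra=2, seed=0):
    """Balance one batch with labels in {+1, -1} using earlier batches."""
    rng = np.random.default_rng(seed)
    n_pos, n_neg = np.sum(y == 1), np.sum(y == -1)
    major, minor = (1, -1) if n_pos >= n_neg else (-1, 1)
    # Step 1: generate a small number of majority class samples.
    new_major = synthesize(X[y == major], X_hist[y_hist == major],
                           n_major_extra, k, rng)
    # Step 2: generate minority class samples until the batch is balanced.
    n_needed = int(np.sum(y == major) + len(new_major) - np.sum(y == minor))
    new_minor = synthesize(X[y == minor], X_hist[y_hist == minor],
                           n_needed, k, rng)
    X_out = np.vstack([X, new_major, new_minor])
    y_out = np.concatenate([y,
                            np.full(len(new_major), major),
                            np.full(len(new_minor), minor)])
    return X_out, y_out
```

In an online setting, a function like `oversample_batch` would be called once per arriving batch, with `X_hist` and `y_hist` holding the real samples seen so far, and the returned real plus synthetic samples would then be passed to whatever update rule trains the target-domain classifier.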



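Memo

Received: 2020-12-16.
Funding: National Natural Science Foundation of China (61572236)
About the authors: ZHOU Jingyu, master's student, whose main research interests are artificial intelligence and pattern recognition; WANG Shitong, professor and doctoral supervisor, whose main research interests are artificial intelligence and pattern recognition. He has published nearly one hundred academic papers.
Corresponding author: WANG Shitong. E-mail: wxwangst@aliyun.com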

