[1]冀常鹏,尚佳奇,代巍.不平衡数据集的DC-SMOTE过采样方法[J].智能系统学报,2024,19(3):525-533.[doi:10.11992/tis.202204013]
 JI Changpeng,SHANG Jiaqi,DAI Wei.DC-SMOTE oversampling method for an imbalanced dataset[J].CAAI Transactions on Intelligent Systems,2024,19(3):525-533.[doi:10.11992/tis.202204013]

DC-SMOTE oversampling method for an imbalanced dataset

参考文献/References:
[1] LIU Lan, WANG Pengcheng, LIN Jun, et al. Intrusion detection of imbalanced network traffic based on machine learning and deep learning[J]. IEEE access, 2020, 9: 7550–7563.
[2] AWOYEMI J O, ADETUNMBI A O, OLUWADARE S A. Credit card fraud detection using machine learning techniques: a comparative analysis[C]//2017 International Conference on Computing Networking and Informatics (ICCNI). Lagos, Nigeria. IEEE, 2017: 1–9.
[3] ZHANG Jue, CHEN Li. Clustering-based undersampling with random over sampling examples and support vector machine for imbalanced classification of breast cancer diagnosis[J]. Computer assisted surgery, 2019, 24(sup2): 62–72.
[4] FOTOUHI S, ASADI S, KATTAN M W. A comprehensive data level analysis for cancer diagnosis on imbalanced data[J]. Journal of biomedical informatics, 2019, 90: 103089.
[5] MA Zhiqiang, YAN Rui, YUAN Donghong, et al. An imbalanced Spam mail filtering method[J]. International journal of multimedia and ubiquitous engineering, 2015, 10(3): 119–126.
[6] CHAWLA N V, BOWYER K W, HALL L O, et al. SMOTE: synthetic minority over-sampling technique[J]. Journal of artificial intelligence research, 2002, 16: 321–357.
[7] HAN Hui, WANG Wenyuan, MAO Binghuan. Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning[M]//Lecture Notes in Computer Science. Berlin: Springer Berlin Heidelberg, 2005: 878–887.
[8] DOUZAS G, BACAO F, LAST F. Improving imbalanced learning through a heuristic oversampling method based on k-means and SMOTE[J]. Information sciences: an international journal, 2018, 465(C): 1–20.
[9] 谢子鹏, 包崇明, 周丽华, 等. 类不平衡数据的EM聚类过采样算法[J]. 计算机科学与探索, 2023, 17(1): 228–237.
XIE Zipeng, BAO Chongming, ZHOU Lihua, et al. EM clustering oversampling algorithm for class imbalanced data[J]. Journal of frontiers of computer science and technology, 2023, 17(1): 228–237.
[10] 王亮, 冶继民. 整合DBSCAN和改进SMOTE的过采样算法[J]. 计算机工程与应用, 2020, 56(18): 111–118.
WANG Liang, YE Jimin. Hybrid algorithm of DBSCAN and improved SMOTE for oversampling[J]. Computer engineering and applications, 2020, 56(18): 111–118.
[11] ISLAM A, BELHAOUARI S B, REHMAN A U, et al. KNNOR: an oversampling technique for imbalanced datasets[J]. Applied soft computing, 2022, 115: 108288.
[12] TSAI C F, LIN Weichao, HU Yahan, et al. Under-sampling class imbalanced datasets by combining clustering analysis and instance selection[J]. Information sciences, 2019, 477: 47–54.
[13] 崔彩霞, 曹付元, 梁吉业. 基于密度峰值聚类的自适应欠采样方法[J]. 模式识别与人工智能, 2020, 33(9): 811–819.
CUI Caixia, CAO Fuyuan, LIANG Jiye. Adaptive undersampling based on density peak clustering[J]. Pattern recognition and artificial intelligence, 2020, 33(9): 811–819.
[14] DAS B, KRISHNAN N C, COOK D J. RACOG and wRACOG: two probabilistic oversampling techniques[J]. IEEE transactions on knowledge and data engineering, 2015, 27(1): 222–234.
[15] YU Hualong, NI Jun, ZHAO Jing. ACOSampling: an ant colony optimization-based undersampling method for classifying imbalanced DNA microarray data[J]. Neurocomputing, 2013, 101: 309–318.
[16] TAO Xinmin, LI Qing, GUO Wenjie, et al. Self-adaptive cost weights-based support vector machine cost-sensitive ensemble for imbalanced data classification[J]. Information sciences: an international journal, 2019, 487(C): 31–56.
[17] HE Hongliang, ZHANG Wenyu, ZHANG Shuai. A novel ensemble method for credit scoring: Adaption of different imbalance ratios[J]. Expert systems with applications, 2018, 98(8): 105–117.
[18] 平瑞, 周水生, 李冬. 高度不平衡数据的代价敏感随机森林分类算法[J]. 模式识别与人工智能, 2020, 33(3): 249–257.
PING Rui, ZHOU Shuisheng, LI Dong. Cost sensitive random forest classification algorithm for highly unbalanced data[J]. Pattern recognition and artificial intelligence, 2020, 33(3): 249–257.
[19] JO T, JAPKOWICZ N. Class imbalances versus small disjuncts[J]. ACM SIGKDD explorations newsletter, 2004, 6(1): 40–49.
[20] RODRIGUEZ A, LAIO A. Clustering by fast search and find of density peaks[J]. Science, 2014, 344(6191): 1492–1496.
[21] DU Mingjing, DING Shifei, JIA Hongjie. Study on density peaks clustering based on k-nearest neighbors and principal component analysis[J]. Knowledge-based systems, 2016, 99(9): 135–145.
[22] WANG Zhiqiang, YU Zhiwen, CHEN C L P, et al. Clustering by local gravitation[J]. IEEE transactions on cybernetics, 2018, 48(5): 1383–1396.
[23] JIANG Jianhua, HAO Dehao, CHEN Yujun, et al. GDPC: gravitation-based Density Peaks Clustering algorithm[J]. Physica A: statistical mechanics and its applications, 2018, 502: 345–355.
[24] HE Haibo, BAI Yang, GARCIA E A, et al. ADASYN: adaptive synthetic sampling approach for imbalanced learning[C]//2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence). Hong Kong: IEEE, 2008: 1322–1328.
[25] KOZIARSKI M, WOŹNIAK M. CCR: a combined cleaning and resampling algorithm for imbalanced data classification[J]. International journal of applied mathematics and computer science, 2017, 27(4): 727–736.
[26] PIRYONESI S M, EL-DIRABY T E. Data analytics in asset management: cost-effective prediction of the pavement condition index[J]. Journal of infrastructure systems, 2020, 26(1): 4019036.
相似文献/Similar articles:
[1]黄庆康,宋恺涛,陆建峰.应用于不平衡多分类问题的损失平衡函数[J].智能系统学报,2019,14(5):953.[doi:10.11992/tis.201808004]
 HUANG Qingkang,SONG Kaitao,LU Jianfeng.Application of the loss balance function to the imbalanced multi-classification problems[J].CAAI Transactions on Intelligent Systems,2019,14(5):953.[doi:10.11992/tis.201808004]
[2]石洪波,陈雨文,陈鑫.SMOTE过采样及其改进算法研究综述[J].智能系统学报,2019,14(6):1073.[doi:10.11992/tis.201906052]
 SHI Hongbo,CHEN Yuwen,CHEN Xin.Summary of research on SMOTE oversampling and its improved algorithms[J].CAAI Transactions on Intelligent Systems,2019,14(6):1073.[doi:10.11992/tis.201906052]
[3]刘金平,周嘉铭,贺俊宾,等.面向不均衡数据的融合谱聚类的自适应过采样法[J].智能系统学报,2020,15(4):732.[doi:10.11992/tis.201909062]
 LIU Jinping,ZHOU Jiaming,HE Junbin,et al.Spectral clustering-fused adaptive synthetic oversampling approach for imbalanced data processing[J].CAAI Transactions on Intelligent Systems,2020,15(4):732.[doi:10.11992/tis.201909062]
[4]周晶雨,王士同.对不平衡目标域的多源在线迁移学习[J].智能系统学报,2022,17(2):248.[doi:10.11992/tis.202012019]
 ZHOU Jingyu,WANG Shitong.Multi-source online transfer learning for imbalanced target domains[J].CAAI Transactions on Intelligent Systems,2022,17(2):248.[doi:10.11992/tis.202012019]
[5]李倩玉,王蓓,金晶,等.基于双向LSTM卷积网络与注意力机制的自动睡眠分期模型[J].智能系统学报,2022,17(3):523.[doi:10.11992/tis.202103013]
 LI Qianyu,WANG Bei,JIN Jing,et al.Automatic sleep staging model based on the bi-directional LSTM convolutional network and attention mechanism[J].CAAI Transactions on Intelligent Systems,2022,17(3):523.[doi:10.11992/tis.202103013]
[6]郭茂祖,王偲佳,王鹏跃,等.基于卫星图的小样本街区品质评估[J].智能系统学报,2022,17(6):1254.[doi:10.11992/tis.202111049]
 GUO Maozu,WANG Sijia,WANG Pengyue,et al.Small sample block quality evaluation based on satellite images[J].CAAI Transactions on Intelligent Systems,2022,17(6):1254.[doi:10.11992/tis.202111049]

备注/Memo

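Received: 2022-04-10.
About the authors: JI Changpeng, professor. Main research interests: signal detection and estimation, intelligent control, electro-hydraulic integration of construction machinery, wireless sensor networks, and computer simulation. Has led or participated in more than 40 research projects; has received one First Prize of the Liaoning Province Science and Technology Progress Award, and three First Prizes and two Second Prizes of the Fuxin City Science and Technology Progress Award; holds 6 national invention patents and 16 utility model patents; has published more than 100 academic papers. E-mail: ccp@lntu.edu.cn; SHANG Jiaqi, master's student. Main research interests: machine learning and data mining. E-mail: 409516478@qq.com; DAI Wei, lecturer, Ph.D. Main research interests: weak signal detection and information processing. Holds 1 national invention patent and 4 software copyrights; has published more than 10 academic papers. E-mail: daiwei0084@126.com
Corresponding author: JI Changpeng. E-mail: ccp@lntu.edu.cn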
