[1]王俊红,段冰倩.一种基于密度的SMOTE方法研究[J].智能系统学报,2017,12(6):865-872.[doi:10.11992/tis.201706049]
WANG Junhong,DUAN Bingqian.Research on the SMOTE method based on density[J].CAAI Transactions on Intelligent Systems,2017,12(6):865-872.[doi:10.11992/tis.201706049]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 12
Issue: No. 6, 2017
Pages: 865-872
Column: Academic Papers, Machine Learning
Publication date: 2017-12-25
- Title: Research on the SMOTE method based on density
- Author(s): WANG Junhong (王俊红), DUAN Bingqian (段冰倩)
- Affiliation: School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
- Keywords: imbalance; classification; sampling; precision; density
- CLC number: TP311
- DOI: 10.11992/tis.201706049
- Abstract: Resampling techniques have been widely used to address the classification of imbalanced classes. Among them, the SMOTE (Synthetic Minority Oversampling Technique) algorithm proposed by Chawla alleviates data imbalance to a certain extent, but it oversamples minority-class data indiscriminately, which can easily lead to over-fitting. To address this problem, this paper proposes a new oversampling method, DS-SMOTE. DS-SMOTE identifies sparse samples based on sample density and uses them as seed samples during sampling; it then follows the idea of SMOTE and generates synthetic samples between each seed sample and its k nearest neighbors. Experimental results show that, compared with similar methods, DS-SMOTE achieves considerably higher precision and G-mean, indicating that it has certain advantages in classifying imbalanced data.
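The two steps named in the abstract (picking sparse minority samples as seeds, then interpolating SMOTE-style toward their k nearest neighbors) can be illustrated with a minimal sketch. The abstract does not state how density is measured or how "sparse" is thresholded, so the density proxy (inverse mean k-NN distance), the quantile cutoff, and every function and parameter name below are assumptions made for illustration, not the authors' DS-SMOTE implementation. The `g_mean` helper uses the common definition, the geometric mean of the true positive rate and the true negative rate, which is assumed to be the G-mean the abstract refers to.

```python
import numpy as np


def knn(X, k):
    """Indices and distances of the k nearest other rows for each row of X."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    idx = np.argsort(dist, axis=1)[:, :k]
    return idx, np.take_along_axis(dist, idx, axis=1)


def density_seeded_smote(X_min, n_new, k=5, sparse_quantile=0.5, seed=0):
    """Hypothetical density-seeded oversampling of minority samples X_min.

    Returns (synthetic_samples, seed_indices).
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    k = min(k, len(X_min) - 1)
    nbr_idx, nbr_dist = knn(X_min, k)

    # Assumed density proxy: inverse of the mean distance to the k neighbors.
    density = 1.0 / (nbr_dist.mean(axis=1) + 1e-12)

    # Assumed sparsity rule: density at or below the chosen quantile.
    seeds = np.flatnonzero(density <= np.quantile(density, sparse_quantile))

    synthetic = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        s = rng.choice(seeds)            # pick a sparse seed sample
        nb = rng.choice(nbr_idx[s])      # pick one of its k nearest neighbors
        gap = rng.random()               # SMOTE interpolation factor in [0, 1)
        synthetic[i] = X_min[s] + gap * (X_min[nb] - X_min[s])
    return synthetic, seeds


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of true positive rate and true negative rate."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    pos = y_true == positive
    tpr = np.mean(y_pred[pos] == positive)
    tnr = np.mean(y_pred[~pos] != positive)
    return float(np.sqrt(tpr * tnr))
```

Restricting the seeds to low-density samples is what separates this sketch from plain SMOTE, which would draw seeds uniformly from all minority samples. Setting `sparse_quantile=1.0` makes every sample a seed, which recovers that uniform behavior.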
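Memo
Received: 2017-06-12.
Funding: National Natural Science Foundation of China (61772323, 61402272); Natural Science Foundation of Shanxi Province (201701D121051).
About the authors: WANG Junhong, female, born in 1979, associate professor, Ph.D.; her main research interests are formal concept analysis, rough sets and granular computing, and data mining. DUAN Bingqian, female, born in 1991, master's student; her main research interest is data mining.
Corresponding author: WANG Junhong. E-mail: wjhwjh@sxu.edu.cn.
Last Update: 2018-01-03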