[1] WANG Junhong, DUAN Bingqian. Research on the SMOTE method based on density[J]. CAAI Transactions on Intelligent Systems, 2017, 12(6): 865-872. [doi:10.11992/tis.201706049]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume:
12
Issue:
2017, No. 6
Page number:
865-872
Column:
Academic Papers: Machine Learning
Publication date:
2017-12-25
- Title:
-
Research on the SMOTE method based on density
- Author(s):
-
WANG Junhong; DUAN Bingqian
-
School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
-
- Keywords:
-
imbalance; classification; sampling; precision; density
- CLC:
-
TP311
- DOI:
-
10.11992/tis.201706049
- Abstract:
-
In recent years, over-sampling has been widely used for classifying imbalanced data. The SMOTE (Synthetic Minority Oversampling Technique) algorithm, proposed by Chawla, alleviates data imbalance to a certain extent but can lead to over-fitting. To address this problem, this paper presents a new sampling method, DS-SMOTE, which identifies sparse samples based on their density and uses them as seed samples during sampling. The SMOTE algorithm is then applied, generating a synthetic sample between each seed sample and one of its k nearest neighbors. Compared with similar algorithms, the proposed algorithm shows a marked improvement in precision and G-mean, and it is well suited to classifying imbalanced data.
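
The two steps described in the abstract (pick sparse minority samples as seeds, then interpolate SMOTE-style toward a neighbor) can be illustrated with a minimal sketch. This is not the authors' DS-SMOTE implementation: the abstract does not give the density measure or the sparsity threshold, so the sketch assumes density is measured by the mean distance to the k nearest minority neighbors and treats a sample as sparse when that distance is at or above the minority-class average. The function names `knn_indices_and_dists` and `density_seeded_smote` are hypothetical.

```python
import numpy as np


def knn_indices_and_dists(X, k):
    """Return the indices of, and distances to, the k nearest neighbors of each row."""
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    idx = np.argsort(d, axis=1)[:, :k]
    dists = np.take_along_axis(d, idx, axis=1)
    return idx, dists


def density_seeded_smote(X_min, n_new, k=5, rng=None):
    """Generate n_new synthetic minority samples seeded from sparse regions.

    Assumptions (not taken from the paper): sparsity is judged by the mean
    distance to the k nearest minority neighbors, and the threshold is the
    average of that quantity over the minority class.
    """
    rng = np.random.default_rng(rng)
    X_min = np.asarray(X_min, dtype=float)
    k = min(k, len(X_min) - 1)  # requires at least two minority samples

    idx, dists = knn_indices_and_dists(X_min, k)
    mean_dist = dists.mean(axis=1)

    # Assumed sparsity rule: larger mean neighbor distance means lower density.
    seeds = np.flatnonzero(mean_dist >= mean_dist.mean())

    # Standard SMOTE interpolation, restricted to the sparse seed samples.
    chosen = rng.choice(seeds, size=n_new, replace=True)
    neighbor = idx[chosen, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return X_min[chosen] + gap * (X_min[neighbor] - X_min[chosen])


if __name__ == "__main__":
    X_minority = np.random.default_rng(0).normal(size=(20, 2))
    synthetic = density_seeded_smote(X_minority, n_new=10, k=3, rng=0)
    print(synthetic.shape)
```

Because each synthetic point lies on the segment between a seed and one of its neighbors, it stays inside the bounding box of the original minority samples. Swapping in a different density estimate only changes how `seeds` is computed.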