[1] SHI Hongbo, CHEN Yuwen, CHEN Xin. Summary of research on SMOTE oversampling and its improved algorithms[J]. CAAI Transactions on Intelligent Systems, 2019, 14(6): 1073-1083. [doi:10.11992/tis.201906052]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 14
Issue: 2019, No. 6
Pages: 1073-1083
Column: Review
Publication date: 2019-11-05
- Title: Summary of research on SMOTE oversampling and its improved algorithms
- Author(s): SHI Hongbo; CHEN Yuwen; CHEN Xin
- Affiliation: School of Information, Shanxi University of Finance and Economics, Taiyuan, Shanxi, 030031
- Keywords: imbalanced data classification; SMOTE; algorithm; k-NN; oversampling; undersampling; high dimensional data; categorical data
- CLC: TP391
- DOI: 10.11992/tis.201906052
- Abstract:
In recent years, the problem of imbalanced classification has received considerable attention. The synthetic minority oversampling technique (SMOTE), a popular method for improving classification performance on imbalanced data, adds synthetically generated minority-class samples to change the class distribution of an imbalanced data set. In this paper, we first describe the fundamentals, algorithm, and existing problems of SMOTE. Then, with respect to these problems, we introduce related research on four types of extension methods and three types of applications. Finally, to provide a useful reference for the research and application of SMOTE, we analyze the difficulties that remain in applying SMOTE to big data, streaming data, data with few labels, and other types of data.
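As an illustration of the basic idea summarized in the abstract, the following is a minimal sketch of SMOTE-style oversampling, not the authors' code. It assumes numeric features, Euclidean distance, and a simplification in which base samples are drawn at random rather than visited in turn; the function name `smote` and its parameters `n_new`, `k`, and `seed` are hypothetical names chosen for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        # Pick a base minority sample at random (simplification, see lead-in).
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority samples by Euclidean distance to x.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        # Choose one of the k nearest minority neighbours.
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment x -> neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Toy minority class with four 2-D samples (made-up values for illustration).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=6, k=2)
```

Because every synthetic point lies on a line segment between two existing minority samples, the generated data stays inside the region the minority class already occupies; this is one reason the survey discusses extensions for cases such as categorical or high dimensional data, where straight-line interpolation is less suitable.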