[1] LI Qianqian, SHEN Xiaoyan, REN Fuji, et al. Investigation of multiple speech emotion classification algorithms based on data enhancement[J]. CAAI Transactions on Intelligent Systems, 2021, 16(1): 170-177. [doi:10.11992/tis.202103005]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 16
Issue: 1 (2021)
Pages: 170-177
Column: Wu Wenjun Artificial Intelligence Science and Technology Award Forum
Publication date: 2021-01-05
- Title:
Investigation of multiple speech emotion classification algorithms based on data enhancement
- Author(s):
LI Qianqian 1,2; SHEN Xiaoyan 1; REN Fuji 2; KANG Xin 2
1. Institute of Information Science and Technology, Nantong University, Nantong 226019, China;
2. Department of Intelligent Information Engineering, Tokushima University, Tokushima 7708501, Japan
- Keywords:
speech emotion recognition; data enhancement; emotion feature; support vector machine; random forest; K-nearest neighbor; low-level descriptor features; machine learning
- CLC:
TP181
- DOI:
10.11992/tis.202103005
- Abstract:
Speech emotion recognition currently suffers from insufficient speech samples and from extracted features that are numerous and often irrelevant to emotion, both of which keep the recognition rate low. To address the shortage of speech samples, a time-frequency domain data enhancement method is proposed in the preprocessing stage to expand the original database. Because traditional algorithms extract large amounts of feature data, much of it unrelated to emotion, 1582-dimensional emotion features and 10 groups of low-level descriptor (LLD) features were extracted instead. Finally, a comparative experiment was performed on three machine learning algorithms: the support vector machine, random forest, and K-nearest neighbor. Experiments showed that the average recognition rate of the support vector machine was the highest of the three. Among the ten sets of features, the accuracy of LogMelFreqBand under the three algorithms was 74.63%, 64.93%, and 66.42%, respectively, and the accuracy of pcm_fftMag_mfcc was 84.33%, 73.13%, and 58.21%, respectively.
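For readers who want to reproduce the overall pipeline, the following is a minimal Python sketch of the approach described in the abstract, assuming librosa and scikit-learn. The augmentation parameters, the MFCC-mean features (a simple stand-in for the paper's 1582-dimensional feature set and LLD groups), and the files/labels inputs are illustrative assumptions, not the authors' exact method.

import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

def augment(y, sr):
    # Time-frequency domain augmentation: illustrative time-stretch and
    # pitch-shift variants standing in for the paper's enhancement step.
    return [
        y,
        librosa.effects.time_stretch(y, rate=0.9),
        librosa.effects.time_stretch(y, rate=1.1),
        librosa.effects.pitch_shift(y, sr=sr, n_steps=2),
        librosa.effects.pitch_shift(y, sr=sr, n_steps=-2),
    ]

def mfcc_features(y, sr):
    # Mean MFCCs over time; a rough stand-in for the pcm_fftMag_mfcc LLD group.
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)

def compare(files, labels, sr=16000):
    # files/labels: paths to emotion-labeled WAV clips and their class labels
    # (assumed inputs; the paper's corpus is not reproduced here).
    X, t, g = [], [], []
    for clip_id, (path, label) in enumerate(zip(files, labels)):
        y, _ = librosa.load(path, sr=sr)
        for variant in augment(y, sr):
            X.append(mfcc_features(variant, sr))
            t.append(label)
            g.append(clip_id)  # tag every variant with its source clip
    X, t, g = np.array(X), np.array(t), np.array(g)
    models = {
        "SVM": SVC(kernel="rbf"),
        "Random Forest": RandomForestClassifier(n_estimators=100),
        "KNN": KNeighborsClassifier(n_neighbors=5),
    }
    for name, model in models.items():
        # GroupKFold keeps augmented copies of one clip in a single fold,
        # so accuracy is not inflated by train/test leakage.
        scores = cross_val_score(model, X, t, groups=g, cv=GroupKFold(n_splits=5))
        print(f"{name}: mean accuracy {scores.mean():.4f}")

Grouping the cross-validation folds by source clip is a deliberate design choice: without it, augmented variants of the same recording would appear in both training and test folds and overstate the recognition rate.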