[1] MA Zhiqiang, LI Tuya, YANG Shuangtao, et al. Mongolian acoustic modeling based on deep neural network[J]. CAAI Transactions on Intelligent Systems, 2018, 13(3): 486-492. [doi:10.11992/tis.201710029]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume:
13
Issue:
2018, No. 3
Page number:
486-492
Column:
Academic Papers: Natural Language Processing and Understanding
Publication date:
2018-05-05
- Title:
-
Mongolian acoustic modeling based on deep neural network
- Author(s):
-
MA Zhiqiang; LI Tuya; YANG Shuangtao; ZHANG Li
-
School of Data Science & Application, Inner Mongolia University of Technology, Hohhot 010080, China
-
- Keywords:
-
speech recognition; acoustic model; GMM-HMM; DNN-HMM; supervised learning; pre-training; over-fitting; dropout
- CLC:
-
TP391
- DOI:
-
10.11992/tis.201710029
- Abstract:
-
Considering that the Gaussian mixture model (GMM) has difficulty adequately describing the correlation among Mongolian acoustic features and depends on an independence hypothesis in acoustic modeling for Mongolian speech recognition, this study investigates an acoustic model based on a deep neural network (DNN). First, a DNN was used to classify and learn the internal structure of the phonetic features in order to extract Mongolian acoustic features, and a DNN-HMM Mongolian acoustic model was constructed. Second, a training algorithm was designed that combines unsupervised pre-training with supervised fine-tuning. In addition, dropout was added to the training of the DNN-HMM Mongolian acoustic model to avoid over-fitting. Finally, the GMM-HMM and DNN-HMM Mongolian acoustic models were compared experimentally on a small-scale corpus using the Kaldi experimental platform. The experimental results show that, compared with the GMM-HMM model, the DNN-HMM Mongolian model reduced the word recognition error rate by 7.5% and the sentence recognition error rate by 13.63%. The results also show that adopting dropout during training effectively avoids over-fitting of the DNN-HMM Mongolian acoustic model.
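The dropout step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the dropout rate, the layer sizes, or the exact variant used in the Kaldi experiments, so everything below is an assumption made for illustration: the function name `dropout`, the rate `drop_prob=0.5`, the sample activations, and the choice of "inverted" dropout (rescaling kept units during training so that no rescaling is needed at recognition time).

```python
import random


def dropout(activations, drop_prob, training=True, rng=None):
    """Inverted dropout over a list of hidden-unit activations.

    Hypothetical helper for illustration only; not the paper's code.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    # At recognition time (or with no dropout) activations pass through unchanged.
    if not training or drop_prob == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep_prob = 1.0 - drop_prob
    # Each unit is kept with probability keep_prob and rescaled by 1/keep_prob,
    # so the expected value of every activation is preserved.
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


# Assumed example values for one hidden layer.
hidden = [0.5, 1.2, 0.0, 2.0, 0.8]
train_out = dropout(hidden, drop_prob=0.5, rng=random.Random(0))
eval_out = dropout(hidden, drop_prob=0.5, training=False)
```

Because a different random subset of units is silenced on every training pass, the network cannot rely on any single unit, which is the usual explanation for why dropout reduces over-fitting on small corpora such as the one described here.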