[1]高庆吉,赵志华,徐达,等.语音情感识别研究综述[J].智能系统学报,2020,15(1):1-13.[doi:10.11992/tis.201904065]
 GAO Qingji,ZHAO Zhihua,XU Da,et al.Review on speech emotion recognition research[J].CAAI Transactions on Intelligent Systems,2020,15(1):1-13.[doi:10.11992/tis.201904065]

语音情感识别研究综述

《智能系统学报》[ISSN:1673-4785/CN:23-1538/TP]

卷:
第15卷
期数:
2020年1期
页码:
1-13
栏目:
综述
出版日期:
2020-01-01

文章信息/Info

Title:
Review on speech emotion recognition research
作者:
高庆吉 赵志华 徐达 邢志伟
中国民航大学 电子信息与自动化学院, 天津 300300
Author(s):
GAO Qingji ZHAO Zhihua XU Da XING Zhiwei
College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China
关键词:
深度学习; 情感语音数据库; 情感描述模型; 语音情感特征; 特征提取; 特征降维; 情感分类; 情感回归
Keywords:
deep learning; sentiment speech databases; sentiment description models; acoustic sentiment features; feature extraction; feature reduction; sentiment classification; sentiment regression
分类号:
TP391
DOI:
10.11992/tis.201904065
摘要:
针对语音情感识别研究体系进行综述。这一体系包括情感描述模型、情感语音数据库、特征提取与降维、情感分类与回归算法4个方面的内容。本文总结离散情感模型、维度情感模型和两模型间单向映射的情感描述方法;归纳出情感语音数据库选择的依据;细化了语音情感特征分类并列出了常用特征提取工具;最后对特征提取和情感分类与回归的常用算法特点进行凝练并总结深度学习研究进展,并提出情感语音识别领域需要解决的新问题、预测了发展趋势。
Abstract:
This paper reviews the research system of speech emotion recognition, which comprises four aspects: emotion description models, emotional speech databases, feature extraction and dimensionality reduction, and emotion classification and regression algorithms. We first summarize emotion description methods based on the discrete emotion model, the dimensional emotion model, and the one-way mapping between the two models; we then outline the criteria for selecting an emotional speech database, refine the taxonomy of speech emotion features, and list common feature extraction tools. Finally, we distill the characteristics of common algorithms for feature extraction and for emotion classification and regression, and summarize the progress made in deep learning research. In addition, we raise new problems that remain to be solved in this field and predict its development trends.
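As a hedged illustration of the pipeline the abstract describes (acoustic feature extraction followed by emotion classification), the following NumPy-only sketch computes two simple frame-level features (short-time energy and zero-crossing rate) from synthetic signals and labels them with a nearest-centroid rule. The feature set, the synthetic "high/low arousal" signals, and the class names are illustrative assumptions for this page, not methods taken from the paper or its references.

```python
import numpy as np

def frame_features(signal, frame_len=256, hop=128):
    """Mean short-time energy and zero-crossing rate over overlapping frames.

    These are two of the simplest prosodic features used in speech
    emotion recognition pipelines (real systems use far richer sets,
    e.g. MFCCs or the GeMAPS parameter set cited in the references).
    """
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    energy = np.mean([np.mean(f ** 2) for f in frames])
    # Fraction of adjacent samples whose sign differs.
    zcr = np.mean([np.mean(np.abs(np.diff(np.sign(f))) > 0) for f in frames])
    return np.array([energy, zcr])

def nearest_centroid(train_feats, train_labels, x):
    """Assign x the label of the closest per-class mean feature vector."""
    labels = sorted(set(train_labels))
    centroids = {l: np.mean([f for f, y in zip(train_feats, train_labels)
                             if y == l], axis=0)
                 for l in labels}
    return min(labels, key=lambda l: np.linalg.norm(x - centroids[l]))

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 8000)
# Hypothetical classes: "high arousal" = loud and noisy, "low arousal" = soft tone.
def high(): return 0.9 * np.sin(2 * np.pi * 220 * t) + 0.3 * rng.standard_normal(t.size)
def low():  return 0.2 * np.sin(2 * np.pi * 220 * t) + 0.02 * rng.standard_normal(t.size)

X = [frame_features(s) for s in [high(), high(), low(), low()]]
y = ["high", "high", "low", "low"]
print(nearest_centroid(X, y, frame_features(high())))  # → high
print(nearest_centroid(X, y, frame_features(low())))   # → low
```

The two classes separate cleanly here because energy and zero-crossing rate differ by an order of magnitude between the synthetic signals; on real emotional speech, the feature selection and dimensionality-reduction steps surveyed in the paper are what make such classifiers work.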

参考文献/References:

[1] PRAVENA D, GOVIND D. Significance of incorporating excitation source parameters for improved emotion recognition from speech and electroglottographic signals[J]. International journal of speech technology, 2017, 20(4): 787–797.
[2] MIXDORFF H, HÖNEMANN A, RILLIARD A, et al. Audio-visual expressions of attitude: how many different attitudes can perceivers decode?[J]. Speech communication, 2017, 95: 114–126.
[3] BUITELAAR P, WOOD I D, NEGI S, et al. MixedEmotions: an open-source toolbox for multimodal emotion analysis[J]. IEEE transactions on multimedia, 2018, 20(9): 2454–2465.
[4] SAPIŃSKI T, KAMIŃSKA D, PELIKANT A, et al. Emotion recognition from skeletal movements[J]. Entropy, 2019, 21(7): 646.
[5] PARIS M, MAHAJAN Y, KIM J, et al. Emotional speech processing deficits in bipolar disorder: the role of mismatch negativity and P3a[J]. Journal of affective disorders, 2018, 234: 261–269.
[6] SCHELINSKI S, VON KRIEGSTEIN K. The relation between vocal pitch and vocal emotion recognition abilities in people with autism spectrum disorder and typical development[J]. Journal of autism and developmental disorders, 2019, 49(1): 68–82.
[7] SWAIN M, ROUTRAY A, KABISATPATHY P. Databases, features and classifiers for speech emotion recognition: a review[J]. International journal of speech technology, 2018, 21(1): 93–120.
[8] 韩文静, 李海峰, 阮华斌, 等. 语音情感识别研究进展综述[J]. 软件学报, 2014, 25(1): 37–50
HAN Wenjing, LI Haifeng, RUAN Huabin, et al. Review on speech emotion recognition[J]. Journal of software, 2014, 25(1): 37–50
[9] 刘振焘, 徐建平, 吴敏, 等. 语音情感特征提取及其降维方法综述[J]. 计算机学报, 2018, 41(12): 2833–2851
LIU Zhentao, XU Jianping, WU Min, et al. Review of emotional feature extraction and dimension reduction method for speech emotion recognition[J]. Chinese journal of computers, 2018, 41(12): 2833–2851
[10] KRATZWALD B, ILIĆ S, KRAUS M, et al. Deep learning for affective computing: text-based emotion recognition in decision support[J]. Decision support systems, 2018, 115: 24–35.
[11] ORTONY A, TURNER T J. What’s basic about basic emotions?[J]. Psychological review, 1990, 97(3): 315–331.
[12] EKMAN P, FRIESEN W V, O’SULLIVAN M, et al. Universals and cultural differences in the judgments of facial expressions of emotion[J]. Journal of personality and social psychology, 1987, 53(4): 712–717.
[13] SCHULLER B W. Speech emotion recognition: two decades in a nutshell, benchmarks, and ongoing trends[J]. Communications of the ACM, 2018, 61(5): 90–99.
[14] 乐国安, 董颖红. 情绪的基本结构: 争论、应用及其前瞻[J]. 南开学报(哲学社会科学版), 2013(1): 140–150
YUE Guoan, DONG Yinghong. On the categorical and dimensional approaches of the theories of the basic structure of emotions[J]. Nankai journal (philosophy, literature and social science edition), 2013(1): 140–150
[15] 李霞, 卢官明, 闫静杰, 等. 多模态维度情感预测综述[J]. 自动化学报, 2018, 44(12): 2142–2159
LI Xia, LU Guanming, YAN Jingjie, et al. A survey of dimensional emotion prediction by multimodal cues[J]. Acta automatica sinica, 2018, 44(12): 2142–2159
[16] FONTAINE J R J, SCHERER K R, ROESCH E B, et al. The world of emotions is not two-dimensional[J]. Psychological science, 2007, 18(12): 1050–1057.
[17] RUSSELL J A. A circumplex model of affect[J]. Journal of personality and social psychology, 1980, 39(6): 1161–1178.
[18] YIK M S M, RUSSELL J A, BARRETT L F. Structure of self-reported current affect: integration and beyond[J]. Journal of personality and social psychology, 1999, 77(3): 600–619.
[19] PLUTCHIK R. The nature of emotions: human emotions have deep evolutionary roots, a fact that may explain their complexity and provide tools for clinical practice[J]. American scientist, 2001, 89(4): 344–350.
[20] ZHALEHPOUR S, ONDER O, AKHTAR Z, et al. BAUM-1: a spontaneous audio-visual face database of affective and mental states[J]. IEEE transactions on affective computing, 2017, 8(3): 300–313.
[21] WANG Wenwu. Machine audition: principles, algorithms and systems[M]. New York: Information Science Reference, 2010: 398–423.
[22] WANG Yongjin, GUAN Ling. Recognizing human emotional state from audiovisual signals[J]. IEEE transactions on multimedia, 2008, 10(4): 659–668.
[23] BURKHARDT F, PAESCHKE A, ROLFES M, et al. A database of German emotional speech[C]//INTERSPEECH 2005. Lisbon, Portugal, 2005: 1517–1520.
[24] MARTIN O, KOTSIA I, MACQ B, et al. The eNTERFACE’ 05 audio-visual emotion database[C]//Proceedings of the 22nd International Conference on Data Engineering Workshops. Atlanta, USA, 2006: 1–8.
[25] LIVINGSTONE S R, RUSSO F A. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): a dynamic, multimodal set of facial and vocal expressions in north American English[J]. PLoS one, 2018, 13(5): e0196391.
[26] STEIDL S. Automatic classification of emotion-related user states in spontaneous children’s speech[M]. Erlangen, Germany: University of Erlangen-Nuremberg, 2009: 1–250
[27] GRIMM M, KROSCHEL K, NARAYANAN S. The Vera am Mittag German audio-visual emotional speech database[C]//Proceedings of 2008 IEEE International Conference on Multimedia and Expo. Hannover, Germany, 2008: 865–868.
[28] BUSSO C, BULUT M, LEE C C, et al. IEMOCAP: interactive emotional dyadic motion capture database[J]. Language resources and evaluation, 2008, 42(4): 335–359.
[29] RINGEVAL F, SONDEREGGER A, SAUER J, et al. Introducing the RECOLA multimodal corpus of remote collaborative and affective interactions[C]//Proceedings of the 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition. Shanghai, China, 2013: 1–8.
[30] METALLINOU A, YANG Zhaojun, LEE C, et al. The USC CreativeIT database of multimodal dyadic interactions: from speech and full body motion capture to continuous emotional annotations[J]. Language resources and evaluation, 2016, 50(3): 497–521.
[31] MCKEOWN G, VALSTAR M, COWIE R, et al. The SEMAINE database: annotated multimodal records of emotionally colored conversations between a person and a limited agent[J]. IEEE transactions on affective computing, 2012, 3(1): 5–17.
[32] 饶元, 吴连伟, 王一鸣, 等. 基于语义分析的情感计算技术研究进展[J]. 软件学报, 2018, 29(8): 2397–2426
RAO Yuan, WU Lianwei, WANG Yiming, et al. Research progress on emotional computation technology based on semantic analysis[J]. Journal of software, 2018, 29(8): 2397–2426
[33] WANG Yiming, RAO Yuan, WU Lianwei. A review of sentiment semantic analysis technology and progress[C]//Proceedings of 2017 13th International Conference on Computational Intelligence and Security. Hong Kong, China, 2017: 452–455.
[34] MORRIS J D. Observations: SAM: the self-assessment manikin—an efficient cross-cultural measurement of emotional response[J]. Journal of advertising research, 1995, 35(6): 63–68.
[35] 夏凡, 王宏. 多模态情感数据标注方法与实现[C]//第一届建立和谐人机环境联合学术会议(HHME2005)论文集. 北京, 2005: 1481–1487.
XIA Fan, WANG Hong. Multi-modal affective annotation method and implementation[C]//Proceedings of the 1st Joint Conference on Harmonious Human-Machine Environment (HHME2005). Beijing, China, 2005: 1481–1487.
[36] COWIE R, DOUGLAS-COWIE E, SAVVIDOU S, et al. FEELTRACE: an instrument for recording perceived emotion in real time[C]//Proceedings of the 2000 ISCA Tutorial and Research Workshop on Speech and Emotion. Newcastle, United Kingdom, 2000: 19–24.
[37] 陈炜亮, 孙晓. 基于MFCCG-PCA的语音情感识别[J]. 北京大学学报(自然科学版), 2015, 51(2): 269–274
CHEN Weiliang, SUN Xiao. Mandarin speech emotion recognition based on MFCCG-PCA[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2015, 51(2): 269–274
[38] SUN Linhui, FU Sheng, WANG Fu. Decision tree SVM model with Fisher feature selection for speech emotion recognition[J]. EURASIP journal on audio, speech, and music processing, 2019, 2019: 2.
[39] NASSIF A B, SHAHIN I, ATTILI I, et al. Speech recognition using deep neural networks: a systematic review[J]. IEEE access, 2019, 7: 19143–19165.
[40] ZHANG Shiqing, ZHANG Shiliang, HUANG Tiejun, et al. Speech emotion recognition using deep convolutional neural network and discriminant temporal pyramid matching[J]. IEEE transactions on multimedia, 2018, 20(6): 1576–1590.
[41] 陈逸灵, 程艳芬, 陈先桥, 等. PAD三维情感空间中的语音情感识别[J]. 哈尔滨工业大学学报, 2018, 50(11): 160–166
CHEN Yiling, CHENG Yanfen, CHEN Xianqiao, et al. Speech emotion estimation in PAD 3D emotion space[J]. Journal of Harbin Institute of Technology, 2018, 50(11): 160–166
[42] 王玮蔚, 张秀再. 基于变分模态分解的语音情感识别方法[J]. 应用声学, 2019, 38(2): 237–244
WANG Weiwei, ZHANG Xiuzai. Speech emotion recognition based on variational mode decomposition[J]. Journal of applied acoustics, 2019, 38(2): 237–244
[43] 王忠民, 刘戈, 宋辉. 基于多核学习特征融合的语音情感识别[J]. 计算机工程, 2019, 45(8): 248–254
WANG Zhongmin, LIU Ge, SONG Hui. Feature fusion based on multiple kernel learning for speech emotion recognition[J]. Computer engineering, 2019, 45(8): 248–254
[44] 卢官明, 袁亮, 杨文娟, 等. 基于长短期记忆和卷积神经网络的语音情感识别[J]. 南京邮电大学学报(自然科学版), 2018, 38(5): 63–69
LU Guanming, YUAN Liang, YANG Wenjuan, et al. Speech emotion recognition based on long short-term memory and convolutional neural networks[J]. Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition), 2018, 38(5): 63–69
[45] ÖZSEVEN T. Investigation of the effect of spectrogram images and different texture analysis methods on speech emotion recognition[J]. Applied acoustics, 2018, 142: 70–77.
[46] JIANG Wei, WANG Zheng, JIN J S, et al. Speech emotion recognition with heterogeneous feature unification of deep neural network[J]. Sensors, 2019, 19(12): 2730.
[47] TORRES-BOZA D, OVENEKE M C, WANG Fengna, et al. Hierarchical sparse coding framework for speech emotion recognition[J]. Speech communication, 2018, 99: 80–89.
[48] MAO Qirong, XUE Wentao, RAO Qiru, et al. Domain adaptation for speech emotion recognition by sharing priors between related source and target classes[C]//Proceedings of 2016 IEEE International Conference on Acoustics, Speech and Signal Processing. Shanghai, China, 2016: 2608–2612.
[49] JIN Qin, LI Chengxin, CHEN Shizhe, et al. Speech emotion recognition with acoustic and lexical features[C]//Proceedings of 2015 IEEE International Conference on Acoustics, Speech and Signal Processing. Brisbane, QLD, Australia, 2015: 4749–4753.
[50] SCHULLER B. Recognizing affect from linguistic information in 3D continuous space[J]. IEEE transactions on affective computing, 2011, 2(4): 192–205.
[51] DIMOULAS C A, KALLIRIS G M. Investigation of wavelet approaches for joint temporal, spectral and cepstral features in audio semantics[C]//Audio Engineering Society Convention. New York, USA, 2013.
[52] TAWARI A, TRIVEDI M M. Speech emotion analysis: exploring the role of context[J]. IEEE transactions on multimedia, 2010, 12(6): 502–509.
[53] QUIROS-RAMIREZ M A, ONISAWA T. Considering cross-cultural context in the automatic recognition of emotions[J]. International journal of machine learning and cybernetics, 2015, 6(1): 119–127.
[54] WU Xixin, LIU Songxiang, CAO Yuewen, et al. Speech emotion recognition using capsule networks[C]//ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, United Kingdom, 2019: 6695–6699.
[55] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks [C]//Proceedings of the 25th International Conference on Neural Information Processing Systems. Red Hook, United States, 2012: 1097–1105.
[56] ZHAO Jianfeng, MAO Xia, CHEN Lijiang. Speech emotion recognition using deep 1D & 2D CNN LSTM networks[J]. Biomedical signal processing and control, 2019, 47: 312–323.
[57] 张丽, 吕军, 强彦, 等. 基于深度信念网络的语音情感识别[J]. 太原理工大学学报, 2019, 50(1): 101–107
ZHANG Li, LV Jun, QIANG Yan, et al. Emotion recognition based on deep belief network[J]. Journal of Taiyuan University of Technology, 2019, 50(1): 101–107
[58] ABDELWAHAB M, BUSSO C. Domain adversarial for acoustic emotion recognition[J]. IEEE/ACM transactions on audio, speech, and language processing, 2018, 26(12): 2423–2435.
[59] MENG Zhong, LI Jinyu, CHEN Zhuo, et al. Speaker-invariant training via adversarial learning[C]//Proceedings of 2018 IEEE International Conference on Acoustics, Speech and Signal Processing. Calgary, AB, Canada, 2018: 5969–5973.
[60] VAN DER MAATEN L, HINTON G. Visualizing data using t-SNE[J]. Journal of machine learning research, 2008, 9: 2579–2605.
[61] BOERSMA P, WEENINK D. Praat, a system for doing phonetics by computer[J]. Glot international, 2002, 5(9/10): 341–345.
[62] EYBEN F, WÖLLMER M, SCHULLER B. openSMILE: the Munich versatile and fast open-source audio feature extractor[C]//Proceedings of the 18th ACM International Conference on Multimedia. Firenze, Italy, 2010: 1459–1462.
[63] ÖZSEVEN T, DÜĞENCI M. SPeech ACoustic (SPAC): a novel tool for speech feature extraction and classification[J]. Applied acoustics, 2018, 136: 1–8.
[64] 孙凌云, 何博伟, 刘征, 等. 基于语义细胞的语音情感识别[J]. 浙江大学学报(工学版), 2015, 49(6): 1001–1008
SUN Lingyun, HE Bowei, LIU Zheng, et al. Speech emotion recognition based on information cell[J]. Journal of Zhejiang University (Engineering Science), 2015, 49(6): 1001–1008
[65] SCHULLER B, BATLINER A, STEIDL S, et al. Recognising realistic emotions and affect in speech: state of the art and lessons learnt from the first challenge[J]. Speech communication, 2011, 53(9/10): 1062–1087.
[66] GHARAVIAN D, BEJANI M, SHEIKHAN M. Audio-visual emotion recognition using FCBF feature selection method and particle swarm optimization for fuzzy ARTMAP neural networks[J]. Multimedia tools and applications, 2016, 76(2): 2331–2352.
[67] 王艳, 胡维平. 基于BP特征选择的语音情感识别[J]. 微电子学与计算机, 2019, 36(5): 14–18
WANG Yan, HU Weiping. Speech emotion recognition based on BP feature selection[J]. Microelectronics & computer, 2019, 36(5): 14–18
[68] 孙颖, 姚慧, 张雪英, 等. 基于混沌特性的情感语音特征提取[J]. 天津大学学报(自然科学与工程技术版), 2015, 48(8): 681–685
SUN Ying, YAO Hui, ZHANG Xueying, et al. Feature extraction of emotional speech based on chaotic characteristics[J]. Journal of Tianjin University (Science and Technology), 2015, 48(8): 681–685
[69] 宋鹏, 郑文明, 赵力. 基于子空间学习和特征选择融合的语音情感识别[J]. 清华大学学报(自然科学版), 2018, 58(4): 347–351
SONG Peng, ZHENG Wenming, ZHAO Li. Joint subspace learning and feature selection method for speech emotion recognition[J]. Journal of Tsinghua University (Science and Technology), 2018, 58(4): 347–351
[70] EYBEN F, SCHERER K R, SCHULLER B W, et al. The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for voice research and affective computing[J]. IEEE transactions on affective computing, 2016, 7(2): 190–202.
[71] ÖZSEVEN T. A novel feature selection method for speech emotion recognition[J]. Applied acoustics, 2019, 146: 320–326.
[72] 姜晓庆, 夏克文, 夏莘媛, 等. 采用半定规划多核SVM的语音情感识别[J]. 北京邮电大学学报, 2015, 38(S1): 67–71
JIANG Xiaoqing, XIA Kewen, XIA Xinyuan, et al. Speech emotion recognition using semi-definite programming multiple-kernel SVM[J]. Journal of Beijing University of Posts and Telecommunications, 2015, 38(S1): 67–71
[73] ZHENG Weiqiao, YU Jiasheng, ZOU Yuexian. An experimental study of speech emotion recognition based on deep convolutional neural networks[C]//Proceedings of 2015 International Conference on Affective Computing and Intelligent Interaction. Xi’an, China, 2015: 827–831.
[74] SHAHIN I, NASSIF A B, HAMSA S. Emotion recognition using hybrid Gaussian mixture model and deep neural network[J]. IEEE access, 2019, 7: 26777–26787.
[75] SAGHA H, JUN Deng, GAVRYUKOVA M, et al. Cross lingual speech emotion recognition using canonical correlation analysis on principal component subspace[C]//Proceedings of 2016 IEEE International Conference on Acoustics, Speech and Signal Processing. Shanghai, China, 2016: 5800–5804.
[76] 陈师哲, 王帅, 金琴. 多文化场景下的多模态情感识别[J]. 软件学报, 2018, 29(4): 1060–1070
CHEN Shizhe, WANG Shuai, JIN Qin. Multimodal emotion recognition in multi-cultural conditions[J]. Journal of software, 2018, 29(4): 1060–1070
[77] 刘颖, 贺聪, 张清芳. 基于核相关分析算法的情感识别模型[J]. 吉林大学学报(理学版), 2017, 55(6): 1539–1544
LIU Ying, HE Cong, ZHANG Qingfang. Emotion recognition model based on kernel correlation analysis algorithm[J]. Journal of Jilin University (Science Edition), 2017, 55(6): 1539–1544
[78] MA Yaxiong, HAO Yixue, CHEN Min, et al. Audio-Visual Emotion Fusion (AVEF): a deep efficient weighted approach[J]. Information fusion, 2019, 46: 184–192.
[79] HUANG Yongming, TIAN Kexin, WU Ao, et al. Feature fusion methods research based on deep belief networks for speech emotion recognition under noise condition[J]. Journal of ambient intelligence and humanized computing, 2019, 10(5): 1787–1798.
[80] MANNEPALLI K, SASTRY P N, SUMAN M. A novel adaptive fractional deep belief networks for speaker emotion recognition[J]. Alexandria engineering journal, 2017, 56(4): 485–497.
[81] XU Xinzhou, DENG Jun, COUTINHO E, et al. Connecting subspace learning and extreme learning machine in speech emotion recognition[J]. IEEE transactions on multimedia, 2019, 21(3): 795–808.
[82] TON-THAT A H, CAO N T. Speech emotion recognition using a fuzzy approach[J]. Journal of intelligent & fuzzy systems, 2019, 36(2): 1587–1597.
[83] ZHANG Biqiao, PROVOST E M, ESSL G. Cross-corpus acoustic emotion recognition with multi-task learning: seeking common ground while preserving differences[J]. IEEE transactions on affective computing, 2019, 10(1): 85–99.
[84] HUANG Kunyi, WU C H, SU M H. Attention-based convolutional neural network and long short-term memory for short-term detection of mood disorders based on elicited speech responses[J]. Pattern recognition, 2019, 88: 668–678.
[85] YOON S A, SON G, KWON S. Fear emotion classification in speech by acoustic and behavioral cues[J]. Multimedia tools and applications, 2019, 78(2): 2345–2366.
[86] 袁非牛, 章琳, 史劲亭, 等. 自编码神经网络理论及应用综述[J]. 计算机学报, 2019, 42(1): 203–230
YUAN Feiniu, ZHANG Lin, SHI Jinting, et al. Theories and applications of auto-encoder neural networks: a literature survey[J]. Chinese journal of computers, 2019, 42(1): 203–230
[87] 林懿伦, 戴星原, 李力, 等. 人工智能研究的新前线: 生成式对抗网络[J]. 自动化学报, 2018, 44(5): 775–792
LIN Yilun, DAI Xingyuan, LI Li, et al. The new frontier of AI research: generative adversarial networks[J]. Acta automatica sinica, 2018, 44(5): 775–792
[88] ZHOU Jie, HUANG J X, CHEN Qin, et al. Deep learning for aspect-level sentiment classification: survey, vision, and challenges[J]. IEEE access, 2019, 7: 78454–78483.
[89] O’SHAUGHNESSY D. Recognition and processing of speech signals using neural networks[J]. Circuits, systems, and signal processing, 2019, 38(8): 3454–3481.
[90] XIE Yue, LIANG Ruiyu, LIANG Zhenlin, et al. Attention-based dense LSTM for speech emotion recognition[J]. IEICE transactions on information and systems, 2019, E102.D(7): 1426–1429.
[91] PEI Jing, DENG Lei, SONG Sen, et al. Towards artificial general intelligence with hybrid Tianjic chip architecture[J]. Nature, 2019, 572: 106–111.
[92] QIAN Yongfeng, LU Jiayi, MIAO Yiming, et al. AIEM: AI-enabled affective experience management[J]. Future generation computer systems, 2018, 89: 438–445.
[93] LADO-CODESIDO M, PÉREZ C M, MATEOS R, et al. Improving emotion recognition in schizophrenia with “VOICES”: an on-line prosodic self-training[J]. PLoS one, 2019, 14(1): e0210816.
[94] CUMMINS N, BAIRD A, SCHULLER B W. Speech analysis for health: current state-of-the-art and the increasing impact of deep learning[J]. Methods, 2018, 151: 41–54.
[95] LIU Zhentao, XIE Qiao, WU Min, et al. Speech emotion recognition based on an improved brain emotion learning model[J]. Neurocomputing, 2018, 309: 145–156.


备注/Memo

收稿日期:2019-04-27。
基金项目:国家自然科学基金委员会-中国民航局民航联合研究基金项目(U1533203)
作者简介:高庆吉,教授,博士,中国航空学会青年工作委员会副主任委员,制导导航与控制委员会委员,主要研究方向为人工智能、智能机器人,主持并完成国家自然科学基金项目、“863”计划项目及省部级科研项目10余项。发表学术论文百余篇;赵志华,硕士研究生,主要研究方向为多模态情感计算、情感识别;徐达,硕士研究生,主要研究方向为机器学习、情感识别
通讯作者:赵志华.E-mail:657902648@qq.com