[1] LIU Wanjun, MENG Renjie, QU Haicheng, et al. Music genre recognition research based on enhanced AlexNet[J]. CAAI Transactions on Intelligent Systems, 2020, 15(4): 750-757. [doi:10.11992/tis.201909032]

Music genre recognition research based on enhanced AlexNet

CAAI Transactions on Intelligent Systems [ISSN:1673-4785/CN:23-1538/TP]

Volume:
15
Issue:
2020, No. 4
Pages:
750-757
Column:
Academic Papers: Knowledge Engineering
Publication date:
2020-10-30

Article Info

Title:
Music genre recognition research based on enhanced AlexNet
Author(s):
LIU Wanjun, MENG Renjie, QU Haicheng, LIU Lamei
College of Software, Liaoning Technical University, Huludao 125105, China
Keywords:
music genre recognition; deep convolutional neural network; machine learning; deep learning; AlexNet; audio feature extraction; music feature recognition
CLC number:
TP181
DOI:
10.11992/tis.201909032
Abstract:
To address the weak ability of machine learning models to recognize music genre features, a music genre recognition model based on a deep convolutional neural network (DCNN-MGR) is proposed. The model first extracts audio information with the fast Fourier transform (FFT), generates spectrograms that can be fed to a DCNN, and cuts them into spectrogram slices. AlexNet is then enhanced by combining the leaky rectified linear unit (Leaky ReLU) function, the hyperbolic tangent (Tanh) function, and a Softplus classifier. Next, the slices are fed to the enhanced AlexNet for multi-batch training and validation, in which music features are extracted and learned, yielding a network model that can effectively distinguish music features. Finally, the resulting model is tested on music genre recognition. Experimental results show that the enhanced AlexNet clearly outperforms the original AlexNet and other commonly used DCNNs in music feature recognition accuracy and network convergence, and that the DCNN-MGR model improves music genre recognition accuracy by 4% to 20% over other machine learning models.
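The preprocessing named in the abstract (an FFT-based spectrum cut into slices) and the three functions used to enhance AlexNet can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame length, hop size, Hann window, slice width, Leaky ReLU slope, and the synthetic test tone are all assumptions chosen for illustration, and the helper names are hypothetical. The abstract does not say how the three functions are arranged inside the network, so the sketch only gives their standard definitions.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram from a windowed FFT; shape (freq_bins, frames).

    frame_len and hop are illustrative assumptions, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # rfft keeps only the non-negative frequencies of a real-valued signal
    return np.abs(np.fft.rfft(frames, axis=1)).T

def slice_spectrogram(spec, width=32):
    """Cut the spectrogram along the time axis into equal-width slices.

    Trailing frames that do not fill a whole slice are dropped (an assumption).
    """
    n_slices = spec.shape[1] // width
    return [spec[:, k * width:(k + 1) * width] for k in range(n_slices)]

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: x for x > 0, alpha * x otherwise (alpha is assumed)."""
    return np.where(x > 0, x, alpha * x)

def tanh(x):
    """Hyperbolic tangent."""
    return np.tanh(x)

def softplus(x):
    """Softplus, log(1 + exp(x)), computed in a numerically stable way."""
    return np.logaddexp(0.0, x)

# Synthetic stand-in for an audio clip: a 440 Hz tone, 2 s at 8 kHz.
sample_rate = 8000
t = np.arange(2 * sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

spec = spectrogram(tone)
slices = slice_spectrogram(spec)
```

In a full pipeline each slice would be treated as an image-like input to the network; the slice width trades off the number of training samples against the temporal context each sample carries.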

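Memo:
Received: 2019-09-16.
Funding: National Natural Science Foundation of China, Young Scientists Fund (41701479)
Author biographies: LIU Wanjun, professor; main research interests: digital image processing, moving object detection and tracking. Has led more than 20 national and provincial/ministerial research projects and published more than 120 academic papers. MENG Renjie, master's student; main research interests: deep learning and natural language processing. QU Haicheng, associate professor; main research interests: hyperspectral remote sensing image processing and GPU parallel computing. Has led one general project each for the Liaoning Provincial Department of Science and Technology and the Liaoning Provincial Department of Education, participated in two National Natural Science Foundation projects, and published more than 30 academic papers.
Corresponding author: MENG Renjie. E-mail: mengrenjie95@163.com
Last Update: 2020-07-25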