[1] LI Qiang, WU Zhengbiao, GUAN Xin. Piano fingering generation with deep musical score feature fusion[J]. CAAI Transactions on Intelligent Systems, 2023, 18(6): 1287-1294. [doi:10.11992/tis.202303018]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
18
Issue:
2023, No. 6
Pages:
1287-1294
Column:
Academic Papers: Intelligent Systems
Publication date:
2023-11-05
- Title:
-
Piano fingering generation with deep musical score feature fusion
- Author(s):
-
LI Qiang (李锵), WU Zhengbiao (吴正彪), GUAN Xin (关欣)
-
School of Microelectronics, Tianjin University, Tianjin 300072, China
- Keywords:
-
artificial intelligence; music; information retrieval; long short-term memory; recurrent neural networks; data processing; feature extraction; time series
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.202303018
- Abstract:
-
Fingering is a key technique in piano performance, yet most scores carry no fingering annotations apart from beginners' textbooks. The hidden Markov model (HMM) and long short-term memory (LSTM) models currently used for automatic piano fingering generation model only the pitch of the score and ignore speed information, which also affects fingering. As a result, they extract the score's overall features poorly and generate fingerings with low accuracy. To address these problems, a feature extraction method is designed that uses the pitch and speed information of the score simultaneously, and the Word2Vec continuous bag-of-words (Word2Vec-CBOW) model is introduced to obtain a fused feature vector. Based on the mirror symmetry of the human left and right hands, data augmentation and joint training of left- and right-hand sequences are applied to the original data. Finally, fingering is generated with a bidirectional long short-term memory network with a conditional random field (BiLSTM-CRF) model. Experimental results show that the proposed algorithm clearly outperforms commonly used statistical learning and deep learning methods, which supports its rationality and effectiveness.
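The abstract gives no implementation details, so the following is only a minimal sketch of two of the ideas it names: mirroring left-hand sequences so they can be trained jointly with right-hand ones, and fusing pitch and speed into a single token per note transition. Everything concrete here is an assumption made for illustration and not the authors' method: the `Note` record, MIDI pitch numbers, the mirror axis at D4 (MIDI 62, about which the keyboard's black/white key pattern is symmetric), the finger convention 1 (thumb) to 5 on both hands, and the inter-onset thresholds in `speed_bin`. The Word2Vec-CBOW embedding and the BiLSTM-CRF tagger are not implemented.

```python
from dataclasses import dataclass
from typing import List

# Assumed mirror axis: MIDI 62 (D4). Reflecting about D maps white keys
# to white keys and black keys to black keys.
MIRROR_AXIS = 62


@dataclass(frozen=True)
class Note:
    """Hypothetical note record used only for this illustration."""
    pitch: int    # MIDI note number
    onset: float  # onset time in seconds
    finger: int   # 1 (thumb) to 5 (little finger), same on both hands


def mirror_left_hand(notes: List[Note], axis: int = MIRROR_AXIS) -> List[Note]:
    """Reflect pitches about `axis`; onsets and finger labels are unchanged."""
    return [Note(2 * axis - n.pitch, n.onset, n.finger) for n in notes]


def speed_bin(gap: float) -> str:
    """Quantize an inter-onset gap (seconds); thresholds are illustrative."""
    if gap < 0.15:
        return "fast"
    if gap < 0.5:
        return "mid"
    return "slow"


def fused_tokens(notes: List[Note]) -> List[str]:
    """Build one token per transition: signed pitch interval plus speed class."""
    tokens = []
    for prev, cur in zip(notes, notes[1:]):
        interval = cur.pitch - prev.pitch
        tokens.append(f"{interval:+d}|{speed_bin(cur.onset - prev.onset)}")
    return tokens


if __name__ == "__main__":
    # Left hand descending C4, B3, A3 with fingers 1, 2, 3.
    left = [Note(60, 0.0, 1), Note(59, 0.1, 2), Note(57, 0.7, 3)]
    mirrored = mirror_left_hand(left)
    print([n.pitch for n in mirrored])
    print(fused_tokens(mirrored))
```

Under these assumptions, a mirrored left-hand passage becomes an ascending right-hand-like passage with the same finger labels, so both hands can share one model. Token sequences of this kind are the sort of input a CBOW embedding trainer could consume before a sequence tagger predicts the finger labels.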
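Memo
Received: 2023-03-10.
Funding: National Natural Science Foundation of China (61872267); Natural Science Foundation of Tianjin (16JCZDJC31100); Tianjin University Innovation Fund (2021XZC-0024).
About the authors: LI Qiang, professor, doctoral supervisor, Ph.D.; main research interests: intelligent information processing, medical image processing, music information retrieval, and digital system and microsystem design; has published more than 130 academic papers. WU Zhengbiao, master's student; main research interests: music signal processing and music information retrieval. GUAN Xin, associate professor, Ph.D.; main research interests: intelligent information processing, statistical learning, and music information retrieval; has published more than 60 academic papers.
Corresponding author: GUAN Xin. E-mail: guanxin@tju.edu.cn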