[1] LI Yixi, WANG Lei, XUE Yu, et al. Research on window function analysis in STFT-based intelligent music generation system[J]. CAAI Transactions on Intelligent Systems, 2025, 20(3): 750-760. [doi:10.11992/tis.202405043]

Research on window function analysis in STFT-based intelligent music generation system


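Memo

Received: 2024-05-31.
About the authors:
LI Yixi, master's student. Main research interests: deep learning and intelligent music generation. E-mail: 1941702@tongji.edu.cn.
WANG Lei, professor and doctoral supervisor. Formerly a committee member of the Shanghai Association for Science and Technology, and concurrently served as vice chair of the IEEE Shanghai Section. Main research interests: intelligent control and intelligent computing. Has co-published 8 monographs and translated works and more than 100 academic papers. E-mail: wanglei@tongji.edu.cn.
WU Qidi, professor and doctoral supervisor. Formerly President of Tongji University and Vice Minister of Education. Main research interests: control theory and applications, computer integrated manufacturing systems, and intelligent automation theory and applications. Has received multiple science and technology progress awards at the national, Ministry of Education, and Shanghai municipal levels. Has published more than 10 academic monographs and more than 200 academic papers. E-mail: wuqidi@moe.edu.cn.
Corresponding author: WANG Lei. E-mail: wanglei@tongji.edu.cn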
