[1]LI Yixi,WANG Lei,XUE Yu,et al.Research on window function analysis in STFT-based intelligent music generation system[J].CAAI Transactions on Intelligent Systems,2025,20(3):750-760.[doi:10.11992/tis.202405043]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 20
Issue: 2025, No. 3
Page number: 750-760
Column: Artificial Intelligence Presidents' Forum
Publication date: 2025-05-05
- Title: Research on window function analysis in STFT-based intelligent music generation system
- Author(s): LI Yixi¹; WANG Lei¹; XUE Yu²; WU Qidi¹
  1. College of Electronics and Information Engineering, Tongji University, Shanghai 201804, China;
  2. Taizhou High School, Taizhou 225300, China
- Keywords: STFT; artificial intelligence; music generation; window function; MFCC; spectrum leakage; main-lobe gain; functions of mixing
- CLC: TP391
- DOI: 10.11992/tis.202405043
- Abstract:
In an intelligent music generation system based on the short-time Fourier transform (STFT), introducing Mel-frequency cepstral coefficients (MFCC) as input features, together with an optimized design of the STFT loss function, improves the quality of the generated music. Computing the STFT of the note input signal requires truncating the time-domain signal by applying a window function. Multiplying the signal by a window in the time domain is equivalent to convolving its spectrum with the window's spectrum in the frequency domain. This truncation introduces spectral analysis errors: energy at each true frequency spreads to both sides in the shape of the window function's spectrum, which is known as spectral leakage. The choice of window function therefore has a significant impact on the quality of the final generated music. On this basis, a window function analysis and selection method is proposed that uses the energy correction factor, the maximum sidelobe, and the main-lobe gain in the frequency domain, and corresponding script tools are developed to design a mixed window function for symbolic-domain music. Experimental results on different MIDI datasets show that the mixed window function effectively reduces the spectral leakage caused by signal truncation and offers good adaptability and flexibility, making it well suited to STFT-based intelligent music generation systems.
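The three selection criteria named in the abstract can be illustrated with a minimal NumPy sketch. It relies on common textbook definitions, which are assumptions here because the abstract does not give the paper's exact formulas: main-lobe gain is taken as the coherent gain (the mean of the window samples), the energy correction factor as the reciprocal of the window's RMS value, and the maximum sidelobe as the highest spectral peak outside the main lobe, in dB relative to the main-lobe peak. The helper names `window_metrics` and `mixed_window` are hypothetical, and the Hann/Hamming convex combination is only one plausible way to mix windows, not necessarily the authors' design.

```python
import numpy as np


def window_metrics(w, pad=64):
    """Return (coherent gain, energy correction factor, peak sidelobe in dB).

    Textbook definitions, assumed here; the paper may define them differently.
    """
    w = np.asarray(w, dtype=float)
    n = len(w)
    # Main-lobe (coherent) gain relative to a rectangular window.
    coherent_gain = w.sum() / n
    # Factor that restores the signal's RMS level after windowing.
    energy_correction = 1.0 / np.sqrt(np.mean(w ** 2))
    # Zero-padded magnitude spectrum, normalised to the main-lobe peak at DC.
    spectrum = np.abs(np.fft.rfft(w, n * pad))
    mag_db = 20 * np.log10(np.maximum(spectrum / spectrum[0], 1e-12))
    # The main lobe ends where the magnitude first starts rising again;
    # everything from that point on is treated as sidelobes.
    first_null = np.where(np.diff(mag_db) > 0)[0][0]
    peak_sidelobe_db = mag_db[first_null:].max()
    return coherent_gain, energy_correction, peak_sidelobe_db


def mixed_window(n, alpha):
    """Hypothetical mixed window: alpha * Hann + (1 - alpha) * Hamming."""
    return alpha * np.hanning(n) + (1 - alpha) * np.hamming(n)


if __name__ == "__main__":
    n = 64
    candidates = {
        "rectangular": np.ones(n),
        "hann": np.hanning(n),
        "hamming": np.hamming(n),
        "mixed (alpha=0.5)": mixed_window(n, 0.5),
    }
    for name, w in candidates.items():
        cg, ecf, psl = window_metrics(w)
        print(f"{name:18s} gain={cg:.3f}  ECF={ecf:.3f}  sidelobe={psl:.1f} dB")
```

Under these definitions, a lower peak sidelobe means less leakage into distant frequency bins, while a lower coherent gain means more amplitude loss that has to be corrected. Sweeping `alpha` and comparing the three metrics is one simple way to explore that trade-off.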