[1]FANG Peng,LI Xian,WANG Zengfu.Conversion of singing voice based on kernel clustering and partial least squares regression[J].CAAI Transactions on Intelligent Systems,2016,11(1):55-60.[doi:10.11992/tis.201506022]
Copy

Conversion of singing voice based on kernel clustering and partial least squares regression

References:
[1] VILLAVICENCIO F, BONADA J. Applying voice conversion to concatenative singing-voice synthesis[C]//Proceedings of Interspeech. Chiba, Japan, 2010:2162-2165.
[2] ABE M, NAKAMURA S, SHIKANO K, et al. Voice conversion through vector quantization[J]. Journal of the acoustical society japan (E), 1990, 11(2):71-76.
[3] KAIN A, MACON M W. Spectral voice conversion for text-to-speech synthesis[C]//Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing. Seattle, WA, USA, 1998, 1:285-288.
[4] STYLIANOU Y, CAPPE, O, MOULINES E. Continuous probabilistic transform for voice conversion[J]. IEEE transactions on speech and audio processing, 1998, 6(2):131-142.
[5] TODA T, BLACK A W, TOKUDA K. Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory[J]. IEEE transactions on audio, speech, and language processing, 2007, 15(8):2222-2235.
[6] HELANDER E, VIRTANEN T, NURMINEN J, et al. Voice conversion using partial least squares regression[J]. IEEE transactions on audio, speech, and language processing, 2010, 18(5):912-921.
[7] LIU Lijuan, CHEN Linghui, LING Zhenhua, et al. Using bidirectional associative memories for joint spectral envelope modeling in voice conversion[C]//Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Florence, Italy, 2014:7884-7888.
[8] CHEN Linghui, LING Zhenhua, LIU Lijuan, et al. Voice conversion using deep neural networks with layer-wise generative training[J]. IEEE/ACM Transactions on audio, speech, and language processing, 2014, 22(12):1859-1872.
[9] DESAI S, BLACK A W, YEGNANARAYANA B, et al. Spectral mapping using artificial neural networks for voice conversion[J]. IEEE transactions on audio, speech, and language processing, 2010, 18(5):954-964.
[10] KOBAYASHI K, TODA T, NEUBIG G, et al. Statistical singing voice conversion with direct waveform modification based on the spectrum differential[C]//Proceedings of Interspeech. Singapore, 2014.
[11] KAWAHARA H, MORISE M, TAKAHASHI T, et al. Tandem-STRAIGHT:A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation[C]//Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP. Las Vegas, NV, USA, 2008:3933-3936.
[12] WU Zhongdong, XIE Weixin, YU Jianping. Fuzzy C-means clustering algorithm based on kernel method[C]//Proceedings of the 5th International Conference on Computational Intelligence and Multimedia Applications. ICCIMA. Xi’an, China, 2003:49-54.
[13] GRAVES D, PEDRYCZ W. Kernel-based fuzzy clustering and fuzzy clustering:a comparative experimental study[J]. Fuzzy Sets Systems, 2010, 161(4):522-543.
[14] DE JONG S. SIMPLS:An alternative approach to partial least squares regression[J]. Chemometrics and intelligent laboratory systems, 1993, 18(3):251-263.
[15] IMAI S, SUMITA K, FURUICHI C. Mel log spectrum approximation (MLSA) filter for speech synthesis[J]. Electronics and communications in Japan (Part I:Communications), 1983, 66(2):10-18.
Similar References:

Memo

-

Last Update: 1900-01-01

Copyright © CAAI Transactions on Intelligent Systems