[1] LUO Yuan, TONG Kaiguo, ZHANG Yi, et al. Sound source separation of a multi voice environment based on human ear listening properties[J]. CAAI Transactions on Intelligent Systems, 2012, 7(2): 121-128.
CAAI Transactions on Intelligent Systems [ISSN 1673-4785 / CN 23-1538/TP]
Volume: 7
Issue: 2012, No. 2
Pages: 121-128
Column: Academic Papers: Machine Perception and Pattern Recognition
Publication date: 2012-04-25
- Title: Sound source separation of a multi voice environment based on human ear listening properties
- Author(s): LUO Yuan; TONG Kaiguo; ZHANG Yi; XING Wuchao; CHEN Kai; CHEN Hongsong; HE Chunjiang; CHEN Jun
- Affiliation: Research Center of Intelligent System and Robot, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Keywords: multi-voice source environment; human ear listening properties; interaural time difference; interaural level difference; sound source separation
- CLC: TP311
- DOI:
- Abstract: Inspired by acoustics, an integrated voice separation model simulating the central auditory system was established to process voices by imitating the listening properties of the human ear. First, multi-spectral analysis of the voice signals was carried out by a peripheral auditory model. Next, a coincidence neuron model was established to extract the features of the voice signals. Last, the voices were separated in a cell model of the brain's inferior colliculus. Whereas the majority of speech recognition models can be used only in single-source, low-noise environments, this model remains effective with multiple sources. Experimental results show that the model can separate voices in a multi-sound-source environment and is highly robust. With further research, speech separation models based on human ear listening properties will have a wide range of applications.
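The keywords name the interaural time difference (ITD) as a core binaural cue for separation. As a minimal illustration only (not the paper's coincidence-neuron model), the ITD between two channels can be estimated by locating the cross-correlation peak; the function name and parameters below are hypothetical:

```python
import math

# Hypothetical sketch: estimate the interaural time difference (ITD)
# in samples by maximizing the cross-correlation of two channels.
def cross_correlation_lag(left, right, max_lag):
    """Return the lag (in samples) of `right` relative to `left`
    that maximizes their cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                score += x * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Example: the right channel is a copy of the left, delayed by 3 samples,
# as would happen for a source closer to the left ear.
left = [math.sin(0.3 * n) for n in range(200)]
right = [0.0] * 3 + left[:-3]
itd_samples = cross_correlation_lag(left, right, max_lag=10)
print(itd_samples)  # lag at which the correlation peaks
```

In a full binaural model this lag, converted to time via the sampling rate, would be combined with the interaural level difference to assign time-frequency regions to sources.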