[1] LUO Yuan, TONG Kaiguo, ZHANG Yi, et al. Sound source separation of a multi voice environment based on human ear listening properties[J]. CAAI Transactions on Intelligent Systems, 2012, 7(2): 121-128.
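CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
7
Issue:
No. 2, 2012
Pages:
121-128
Column:
Academic Papers: Machine Perception and Pattern Recognition
Publication date: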
2012-04-25
- Title:
-
Sound source separation of a multi voice environment based on human ear listening properties
- Article ID:
-
1673-4785(2012)02-0121-08
- Author(s):
-
LUO Yuan (罗元), TONG Kaiguo (童开国), ZHANG Yi (张毅), XING Wuchao (邢武超), CHEN Kai (陈凯), CHEN Hongsong (陈红松), HE Chunjiang (何春江), CHEN Jun (陈君)
-
Research Center of Intelligent System and Robot, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Keywords:
-
multi-voice source environment; human ear listening properties; interaural time difference; interaural level difference; sound source separation
- CLC number:
-
TP311
- Document code:
-
A
- Abstract:
-
Inspired by acoustics research and by the way the human brain and ears process speech, a complete speech separation model simulating the central auditory system was established. First, a peripheral auditory model performs multispectral analysis of the speech signal. Next, a coincidence neuron model extracts the features of the speech signal. Finally, the voices are separated in a model of the nerve cells of the inferior colliculus. Most existing speech recognition methods can be used only in single-source, low-noise environments; this model addresses that limitation. Experimental results show that the model can separate speech in a multi-source environment and is highly robust. With further research, speech separation models based on human ear listening properties are expected to find a wide range of applications.
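The keywords name two binaural cues, the interaural time difference (ITD) and the interaural level difference (ILD). The sketch below shows one common textbook way to estimate them from a two-channel recording: a lag search over the cross-correlation for the ITD and a broadband energy ratio for the ILD. It does not reproduce the peripheral auditory model, the coincidence neuron model, or the inferior colliculus cell model described in the abstract. The function names, the 1 ms search window, and the sign convention are assumptions made for this illustration only.

```python
import numpy as np


def _corr_at_lag(left, right, lag):
    """Return sum over n of left[n] * right[n + lag] on the overlapping part."""
    n = len(left)
    if lag >= 0:
        return float(np.dot(left[: n - lag], right[lag:]))
    return float(np.dot(left[-lag:], right[: n + lag]))


def estimate_itd_ild(left, right, fs, max_itd=1e-3):
    """Estimate ITD in seconds and ILD in dB from two equal-length channels.

    Sign convention (an assumption of this sketch): positive values mean
    the left channel leads in time and is higher in level.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    if left.shape != right.shape or left.ndim != 1:
        raise ValueError("left and right must be 1-D arrays of equal length")

    # Search only lags within +/- max_itd (hypothetical default: 1 ms).
    max_lag = min(int(round(max_itd * fs)), len(left) - 1)
    lags = np.arange(-max_lag, max_lag + 1)
    corr = np.array([_corr_at_lag(left, right, int(lag)) for lag in lags])

    # The lag with the largest correlation is taken as the time difference.
    itd = lags[int(np.argmax(corr))] / fs

    # Broadband level difference; eps guards against log of zero.
    eps = 1e-12
    ild = 10.0 * np.log10((np.sum(left**2) + eps) / (np.sum(right**2) + eps))
    return itd, ild


if __name__ == "__main__":
    fs = 16000
    rng = np.random.default_rng(0)
    source = rng.standard_normal(4000)
    delay = 5  # samples by which the right channel lags the left
    left = source
    right = 0.5 * np.concatenate([np.zeros(delay), source[:-delay]])
    itd, ild = estimate_itd_ild(left, right, fs)
    print(f"ITD = {itd * 1e6:.1f} us, ILD = {ild:.2f} dB")
```

In the demo, the right channel is the left channel delayed by 5 samples and halved in amplitude, so the estimated ITD equals 5/fs and the ILD is close to 6 dB. A model that follows the auditory pathway would typically compute such cues per frequency band after a filterbank rather than on the broadband signal as done here.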
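Memo
Received: 2011-09-28.
Foundation items: International Cooperation Program of the Ministry of Science and Technology (2010DF12160); Chongqing Key Science and Technology Program (CSTC:2010AA2055).
Corresponding author: TONG Kaiguo. E-mail: 359018647@qq.com.
Author biographies: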
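LUO Yuan, female, born in 1972, professor, Ph.D. In recent years she has participated in and led a number of national and provincial or ministerial projects, including International Cooperation Projects of the Ministry of Science and Technology, projects of the Ministry of Education for returned overseas scholars, and Chongqing scientific research projects. Her main research interests are machine vision, human-computer interaction, and testing based on image and video processing. In recent years she has published more than 60 academic papers, over 20 of which are indexed by SCI or EI, and has been granted 3 national invention patents.
TONG Kaiguo, male, born in 1985, master's student. His main research interests are speech recognition and intelligent robots. He has published 4 academic papers.
ZHANG Yi, male, born in 1966, professor, doctoral supervisor, with postdoctoral experience. In recent years he has undertaken an International Cooperation Project of the Ministry of Science and Technology, a key merit-based funding project of the Ministry of Personnel for science and technology activities of overseas-educated personnel, and the Chongqing key science and technology project "Research and development of a navigation and control system for a wheelchair robot". He is an editorial board member for the special issues on intelligent systems and robots of the international journals International Journal of Modelling, Identification and Control, International Journal of Automation and Computing, and International Journal of Advanced Mechatronic Systems.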
Last Update:
2012-07-12