[1] ZHANG Lei, CHEN Jing, XIANG Xue-zhi, et al. Research of keyword spotting based on a keyword spotting confusion network[J]. CAAI Transactions on Intelligent Systems, 2010, 5(05): 432-435. [doi:10.3969/j.issn.1673-4785.2010.05.009]

Research of keyword spotting based on a keyword spotting confusion network

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
Vol. 5
Issue:
2010, No. 05
Pages:
432-435
Section:
Publication date:
2010-10-25

Article Info

Title:
Research of keyword spotting based on a keyword spotting confusion network
Article ID:
1673-4785(2010)05-0432-04
Author(s):
ZHANG Lei (张磊), CHEN Jing (陈晶), XIANG Xue-zhi (项学智), JIA Mei-mei (贾梅梅)
College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001, China
Keywords:
keyword spotting; confusion network; speech recognition
CLC number:
TP391;TN912
DOI:
10.3969/j.issn.1673-4785.2010.05.009
Document code:
A
Abstract:
To efficiently extract keywords from the multiple candidates produced by large vocabulary continuous speech recognition (LVCSR) while keeping the word error rate to a minimum, the confusion network is applied to a keyword spotting system. Building on the traditional confusion network generation method, an improved keyword confusion network that is better suited to keyword spotting is proposed as the intermediate structure. The method generates a score-labeled keyword confusion network only from the competing keyword candidates, which highlights the competition among candidates, and keywords are then determined from the score labels. Compared with a traditional keyword spotting system that uses N-best lists as the intermediate structure, the confusion-network-based system reaches a recall rate of 87.11%, an improvement of 21.65%. Experiments show that, in addition to raising the recall rate, the proposed method locates keywords directly and therefore has a lower time cost.
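
The abstract does not spell out how the keyword confusion network is built, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that each keyword hypothesis carries a time span and a score that behaves like a path posterior, that hypotheses overlapping in time compete in the same confusion set, and that scores for the same word within a set are summed. The names `Candidate`, `build_keyword_confusion_network`, `spot_keywords`, and the threshold value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Candidate:
    """One word hypothesis from the recognizer's multiple candidates (assumed shape)."""
    word: str
    start: float  # start time in seconds
    end: float    # end time in seconds
    score: float  # assumed to behave like the posterior of the path it came from


# A confusion set: (start time, end time, {keyword: accumulated score}).
Slot = Tuple[float, float, Dict[str, float]]


def build_keyword_confusion_network(
    candidates: Iterable[Candidate], keywords: Set[str]
) -> List[Slot]:
    """Keep only keyword hypotheses and group time-overlapping ones into confusion sets."""
    kept = sorted(
        (c for c in candidates if c.word in keywords),
        key=lambda c: (c.start, c.end),
    )
    slots: List[list] = []  # each entry is [start, end, {word: score}]
    for c in kept:
        # Candidates are sorted by start time, so overlap with the current
        # set reduces to "starts before the set ends".
        if slots and c.start < slots[-1][1]:
            slot = slots[-1]
            slot[1] = max(slot[1], c.end)
            slot[2][c.word] = slot[2].get(c.word, 0.0) + c.score
        else:
            slots.append([c.start, c.end, {c.word: c.score}])
    return [(s, e, scores) for s, e, scores in slots]


def spot_keywords(
    network: List[Slot], threshold: float
) -> List[Tuple[str, float, float, float]]:
    """Pick the best-scoring keyword in each confusion set if it passes the threshold."""
    hits = []
    for start, end, scores in network:
        word, score = max(scores.items(), key=lambda kv: kv[1])
        if score >= threshold:
            hits.append((word, start, end, score))
    return hits
```

Because non-keyword hypotheses are discarded before grouping, each confusion set directly marks a time region where a keyword may occur, which is one way to read the abstract's remark about direct keyword localization.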


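Memo

Received: 2009-12-03.
Funding: National Natural Science Foundation of China (60702053); Heilongjiang Province Support Program for Young Backbone Teachers (1155G17).
Corresponding author: ZHANG Lei. E-mail: zhanglei@hrbeu.edu.cn.
About the authors:
ZHANG Lei, female, born in 1973, associate professor. Her main research interest is speech signal processing. She has led or participated in 4 National Natural Science Foundation of China projects and has published more than 30 academic papers.
CHEN Jing, female, born in 1984, master's student. Her main research interest is speech signal processing.
XIANG Xue-zhi, male, born in 1979, lecturer, Ph.D. His main research interest is signal processing. He has participated in several National Natural Science Foundation of China projects and has published more than 20 academic papers.
Last Update: 2010-11-26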