
[1] SHEN Ying-quan, LIU Yong-jin, CAI Jun, et al. Method and implementation of transcribing speech corpora based on human computation[J]. CAAI Transactions on Intelligent Systems, 2009, 4(3): 270-277.


Method and implementation of transcribing speech corpora based on human computation (利用人类计算技术的语音语料库标注方法及其实现)


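CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]; Volume: 4; Issue: No. 3, 2009; Pages: 270-277; Section: Academic Papers, Natural Language Processing and Understanding; Publication date: 2009-06-25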

Title:: Method and implementation of transcribing speech corpora based on human computation

Article ID:: 1673-4785(2009)03-0270-08

Author(s):: SHEN Ying-quan (沈映泉)^1, LIU Yong-jin (刘勇进)^1, CAI Jun (蔡骏)^1,2, SHI Xiao-dong (史晓东)^1
1. Department of Cognitive Science, Xiamen University, Xiamen 361005, Fujian, China;
2. Groupe Parole, LORIA-CNRS & INRIA, BP 239, 54600 Vandoeuvre-lès-Nancy, France

Keywords:: speech corpora transcription; human computation; distributed knowledge acquisition; Web-based language learning

CLC number:: TP39

Document code:: A

Abstract:: A method based on human computation is proposed for transcribing speech corpora. A Web-based language learning system collects orthographic and phonetic transcriptions entered by a large number of learners (users), and the user input that occurs with the highest probability is chosen as the correct transcription of the corresponding speech data. To safeguard the quality of transcriptions obtained in this way, several computer-aided mechanisms verify the reliability of the collected input. The main advantage of the method is that it combines corpus transcription with language learning: no dedicated workforce is needed for the tedious transcription work, which reduces the cost of transcribing a corpus. The paper discusses this human-computation-based transcription technique and describes the design of the language learning system that collects user input and of the transcription generation system. Application of the system shows that the method generates orthographic and phonetic transcriptions of speech corpora effectively and at low cost.
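The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the text normalization, and the two thresholds (`min_votes`, `min_share`) are assumptions introduced here only to show how the most frequent user input might be chosen and how a simple reliability check could reject weak agreement.

```python
from collections import Counter


def normalize(text):
    """Lower-case and collapse whitespace so trivially different inputs match."""
    return " ".join(text.lower().split())


def select_transcription(user_inputs, min_votes=3, min_share=0.5):
    """Return the most frequent normalized input, or None if agreement is weak.

    min_votes and min_share are hypothetical reliability thresholds,
    not values taken from the paper.
    """
    # Count each distinct transcription, ignoring empty submissions.
    counts = Counter(normalize(t) for t in user_inputs if t.strip())
    total = sum(counts.values())
    if total == 0 or total < min_votes:
        return None  # too few inputs to trust any label
    label, votes = counts.most_common(1)[0]
    if votes / total < min_share:
        return None  # no input is clearly dominant
    return label


inputs = ["the cat sat", "The cat  sat", "the cat sad", "the cat sat"]
chosen = select_transcription(inputs)
```

Here three of the four inputs normalize to the same string, so that string is selected; with fewer inputs or a more even split the function returns `None` and the item would stay unlabeled until more users respond.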


Memo

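Received: 2008-07-02.
Funding: State Scholarship Fund of China (2006104705); Natural Science Foundation of Fujian Province (2006J0043); Xiamen University "985 Project" Phase II Information Innovation Platform (0000X07204).
Corresponding author: CAI Jun. E-mail: Jun.Cai@ulb.ac.be, Jun.Cai@loria.fr.
About the authors: SHEN Ying-quan, male, born in 1984, master's student. Main research interests: speech emotion recognition and natural language processing.
LIU Yong-jin, male, born in 1984, master's student. Main research interest: natural language processing.
CAI Jun, male, born in 1966, associate professor, Ph.D. Researcher at the Image, Signal and Telecommunications Laboratory of the Free University of Brussels (ULB). Member of the IEEE Computer Society, the IEEE Signal Processing Society, and the International Speech Communication Association. Main research interests: automatic speech recognition and computer speech processing, with in-depth work on real-time computation for automatic speech recognition and on articulatory modeling of human speech. He has participated in or led 20 research projects and published more than 30 academic papers.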

Last Update: 2009-08-31
