[1] SHEN Ying-quan, LIU Yong-jin, CAI Jun, et al. Method and implementation of transcribing speech corpora based on human computation[J]. CAAI Transactions on Intelligent Systems, 2009, 4(3): 270-277.
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 4
Issue: 2009, No. 3
Page number: 270-277
Column: Academic Papers: Natural Language Processing and Understanding
Publication date: 2009-06-25
- Title: Method and implementation of transcribing speech corpora based on human computation
- Author(s): SHEN Ying-quan 1; LIU Yong-jin 1; CAI Jun 1,2; SHI Xiao-dong 1
  1. Department of Cognitive Science, Xiamen University, Xiamen 361005, China
  2. Groupe Parole, LORIA-CNRS & INRIA, BP 239, 54600 Vandoeuvre-lès-Nancy, France
- Keywords: speech corpora transcription; human computation; distributed knowledge acquisition; Web-based language learning
- CLC: TP39
- DOI:
- Abstract:
A new method is proposed for generating transcriptions of speech corpora based on human computation. The method collects orthographic and phonetic transcriptions from a large number of users through a Web-based language learning system and chooses commonly used labels as the transcriptions of the speech corpora. To guarantee transcription quality, computer-aided mechanisms are also used to verify the collected transcriptions. The method combines speech data transcription with language learning and effectively reduces the cost of transcribing corpora. This article discusses the technology of human-computation-based speech corpus transcription and the detailed design of the language learning system, and describes the transcription generation system. Application of the system shows that the method is an effective and economical way to generate orthographic and phonetic transcriptions.
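The label-selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `normalize` and `choose_transcription`, the whitespace and case normalization, and the `min_votes` and `min_share` thresholds are all assumptions made for illustration. It simply picks the most frequent transcription submitted by users for one utterance and rejects it when agreement is too low.

```python
from collections import Counter
from typing import Iterable, Optional


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different answers match.

    Assumption: the paper does not specify its normalization rules.
    """
    return " ".join(text.lower().split())


def choose_transcription(
    candidates: Iterable[str],
    min_votes: int = 2,      # hypothetical threshold
    min_share: float = 0.5,  # hypothetical threshold
) -> Optional[str]:
    """Return the most commonly submitted transcription, or None.

    None is returned when there are no usable submissions or when the
    top label has too few votes or too small a share of all votes.
    """
    counts = Counter(normalize(c) for c in candidates if c.strip())
    if not counts:
        return None
    label, votes = counts.most_common(1)[0]
    total = sum(counts.values())
    if votes >= min_votes and votes / total >= min_share:
        return label
    return None


if __name__ == "__main__":
    submissions = ["Hello world", "hello  world", "hallo world"]
    print(choose_transcription(submissions))
```

An utterance for which `choose_transcription` returns `None` would, under this sketch, stay in the pool until more users have transcribed it. The computer-aided verification mentioned in the abstract is a separate step and is not modeled here.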