[1] Munire·Muhetaer, LI Xiao, YANG Yating, et al. Affix-based key technology for Uyghur proverb recognition[J]. CAAI Transactions on Intelligent Systems, 2018, 13(3): 452-457. [doi:10.11992/tis.201706092]
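CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume:
13
Issue:
No. 3, 2018
Pages:
452-457
Column:
Academic Papers: Natural Language Processing and Understanding
Publication date:
2018-05-05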
- Title:
-
Affix-based key technology for Uyghur proverb recognition
- Author(s):
-
Munire·Muhetaer (1,2,3), LI Xiao (1,2), YANG Yating (1,2), AZRAGUL (4), ZHOU Xi (1,2)
-
1. Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China;
2. Xinjiang Key Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China;
3. University of Chinese Academy of Sciences, Beijing 100049, China;
4. School of Computer Science and Technology, Xinjiang Normal University, Urumqi 830054, China
-
- Keywords:
-
Uyghur proverbs; proverb affixes; proverb rules; affix coverage rate; proverb rule base; proverb corpus; recognition system
- Classification number:
-
TP391.1
- DOI:
-
10.11992/tis.201706092
- Abstract:
-
In natural language processing fields such as natural language understanding, machine translation, and public opinion analysis, Uyghur proverb recognition is an important part of text entity recognition. To meet the need for digitized Uyghur proverb resources, this paper builds a relatively complete corpus of Uyghur proverbs. It also analyzes the grammatical and semantic structure of Uyghur proverbs from the perspective of traditional linguistics and constructs a knowledge base of Uyghur proverb rules composed of the functional categories (affixes) that occur in Uyghur proverbs. This knowledge base is then combined with natural language processing techniques to implement a software system that can both recognize Uyghur proverbs in text and provide functions such as translation between Uyghur and Chinese. The system also lays a new foundation for computer understanding and processing of Uyghur text.
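The abstract does not give the rule format or the matching procedure, so the following is only a minimal sketch of what affix-based rule matching could look like. It assumes that a rule is an ordered sequence of suffixes that must appear as word endings from left to right. The names `AffixRule`, `matches`, `recognize`, and `coverage_rate`, the suffix strings, and the coverage definition (the share of corpus sentences matched by at least one rule) are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

PUNCTUATION = ".,;:!?\"'"


@dataclass(frozen=True)
class AffixRule:
    """Hypothetical rule: suffixes that must end successive words, in order."""
    name: str
    suffixes: Tuple[str, ...]


def matches(rule: AffixRule, tokens: List[str]) -> bool:
    """Return True if the rule's suffixes occur in order as token endings."""
    pos = 0
    for tok in tokens:
        if pos == len(rule.suffixes):
            break
        suffix = rule.suffixes[pos]
        # Require a non-empty stem so a bare suffix does not count as a word.
        if len(tok) > len(suffix) and tok.endswith(suffix):
            pos += 1
    return pos == len(rule.suffixes)


def recognize(sentence: str, rules: List[AffixRule]) -> List[str]:
    """Return the names of all rules that the sentence satisfies."""
    tokens = [t.strip(PUNCTUATION) for t in sentence.lower().split()]
    tokens = [t for t in tokens if t]
    return [r.name for r in rules if matches(r, tokens)]


def coverage_rate(corpus: List[str], rules: List[AffixRule]) -> float:
    """Fraction of corpus sentences matched by at least one rule (assumed definition)."""
    if not corpus:
        return 0.0
    hits = sum(1 for sentence in corpus if recognize(sentence, rules))
    return hits / len(corpus)


# Placeholder rule and placeholder "words": illustrative strings only.
RULES = [AffixRule("suffix-sa-then-suffix-ar", ("sa", "ar"))]
SAMPLE_CORPUS = ["xsa, yar.", "foo bar"]

if __name__ == "__main__":
    for sentence in SAMPLE_CORPUS:
        print(sentence, recognize(sentence, RULES))
    print(coverage_rate(SAMPLE_CORPUS, RULES))
```

A real system would need morphological analysis rather than plain string suffix checks, since Uyghur is agglutinative and affixes can stack; the sketch only illustrates the idea of a rule base keyed on affix patterns.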
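Memo
Received: 2017-06-30.
Funding: Open Project of the Key Laboratory of Xinjiang Uygur Autonomous Region (2015KL031); Major Science and Technology Special Project of Xinjiang Uygur Autonomous Region (2016A03007-3); Natural Science Foundation of Xinjiang Uygur Autonomous Region (2015211B034); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030400); Social Science Foundation of Xinjiang Uygur Autonomous Region (2016CYY067).
About the authors: Munire·Muhetaer, female, born in 1989, Ph.D.; her main research interests are multilingual information processing, machine translation, and natural language processing. LI Xiao, male, born in 1957, research professor; his main research interests are multilingual information processing and artificial intelligence. He has led or participated in several National 863 Program projects and Chinese Academy of Sciences Strategic Priority projects and has published more than 60 academic papers. YANG Yating, female, born in 1985, associate research professor, Ph.D.; her main research interests are machine translation and natural language processing. She has participated in several National 863 Program projects and Chinese Academy of Sciences Strategic Priority projects and has published more than 30 academic papers.
Corresponding author: LI Xiao. E-mail: xiaoli@ms.xjb.ac.cn.
Last Update: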
2018-06-25