[1] ZHANG Tao, JIA Zhen, LI Tianrui, et al. Open-domain question-answering system based on large-scale knowledge base[J]. CAAI Transactions on Intelligent Systems, 2018, 13(04):557-563. [doi:10.11992/tis.201707039]

Open-domain question-answering system based on large-scale knowledge base

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
Vol. 13
Issue:
2018, No. 04
Pages:
557-563
Column:
Publication date:
2018-07-05

Article Info

Title:
Open-domain question-answering system based on large-scale knowledge base
Author(s):
ZHANG Tao, JIA Zhen, LI Tianrui, HUANG Yanyong
School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, China
Keywords:
question-answering system; open domain; entity recognition; entity linking; knowledge base
CLC number:
TP391.1
DOI:
10.11992/tis.201707039
Abstract:
Question-answering (QA) systems understand user questions and return answers directly. Most existing QA systems are domain-oriented and can only answer questions in a specific domain. This paper proposes a method for building an open-domain QA system on a large-scale knowledge base. First, the system identifies the subject of the question by combining word segmentation with a custom dictionary and a conditional random field (CRF) model. Second, it uses fuzzy matching to link the subject of the question to entities in the knowledge base. Third, it recognizes the predicate of the question through similarity computation, rule matching, and other methods, and associates it with the attributes of the knowledge-base entity. Finally, it performs entity disambiguation and answer retrieval. The system achieves a mean F-measure of 0.6956, which indicates that the proposed method is feasible for knowledge-base-driven open-domain QA.
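The linking, predicate-matching, and disambiguation steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy knowledge base, the rule table, the thresholds, and every function name are hypothetical, and Python's `difflib.SequenceMatcher` stands in for whatever fuzzy-matching and similarity measures the paper actually uses. The subject mention is assumed to have been extracted already, so the dictionary-plus-CRF recognition step is not shown.

```python
from difflib import SequenceMatcher

# Toy knowledge base (hypothetical data): entity name -> {attribute: value}.
KB = {
    "Alpha University": {"location": "Northtown", "founded": "1950"},
    "Alpha Univ Press": {"location": "Southtown"},
    "Beta Institute": {"location": "Easton", "president": "J. Doe"},
}

# Hypothetical rule table mapping question cue words to KB attributes.
PREDICATE_RULES = {"where": "location", "when": "founded", "who": "president"}


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_entity(mention, kb, threshold=0.6):
    """Fuzzy matching: KB entities similar enough to the mention, best first."""
    scored = sorted(((similarity(mention, name), name) for name in kb), reverse=True)
    return [name for score, name in scored if score >= threshold]


def match_predicate(question, attributes, threshold=0.8):
    """Rule matching first, then similarity between question tokens and attribute names."""
    tokens = question.lower().rstrip("?").split()
    for tok in tokens:
        attr = PREDICATE_RULES.get(tok)
        if attr in attributes:
            return attr
    for attr in attributes:
        if any(similarity(tok, attr) >= threshold for tok in tokens):
            return attr
    return None


def answer(question, mention, kb=KB):
    """Disambiguate by keeping the best-linked entity that has the asked-for attribute."""
    for entity in link_entity(mention, kb):
        attr = match_predicate(question, kb[entity])
        if attr is not None:
            return entity, attr, kb[entity][attr]
    return None


print(answer("Where is Alpha Universty?", "Alpha Universty"))
print(answer("When was Alpha Univ founded?", "Alpha Univ"))
```

In the second call the mention "Alpha Univ" links to two candidate entities; only one of them has a `founded` attribute, so the predicate match doubles as a simple disambiguation signal.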


Memo
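Received: 2017-07-25.
Funding: National Natural Science Foundation of China (61573292); Young Scientists Fund of the National Natural Science Foundation of China (61603313).
Author biographies: ZHANG Tao, male, born in 1989, master's student; main research interests: Chinese information processing, information extraction, and intelligent question answering. JIA Zhen, female, born in 1975, lecturer, Ph.D.; main research interests: natural language understanding, Chinese information processing, information extraction, and big data. LI Tianrui, male, born in 1969, professor, doctoral supervisor, Ph.D.; main research interests: intelligent information processing, data mining, cloud computing, and big data.
Corresponding author: ZHANG Tao. E-mail: tzhangswjtu@163.com.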
Last Update: 2018-08-25