[1] HUANG Heyan, LIU Xiao. A survey on event extraction in new domains[J]. CAAI Transactions on Intelligent Systems, 2022, 17(1): 201-212. [doi:10.11992/tis.202109045]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
17
Issue:
No. 1, 2022
Pages:
201-212
Column:
Artificial Intelligence Deans' Forum
Publication date:
2022-01-05
- Title:
-
A survey on event extraction in new domains
- Author(s):
-
HUANG Heyan (黄河燕)1,2,3, LIU Xiao (刘啸)1,2,3
-
1. School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China;
2. Beijing Engineering Research Center of High-Volume Language Information Processing and Cloud Computing Applications, Beijing 100081, China;
3. Southeast Academy of Information Technology, Beijing Institute of Technology, Putian, Fujian 351100, China
-
- Keywords:
-
event extraction; new domains; information extraction; event schema induction; collective extraction; event factuality prediction; natural language processing; knowledge base
- Classification number:
-
TP391.4
- DOI:
-
10.11992/tis.202109045
- Abstract:
-
In the current Internet era, large volumes of unstructured text in new domains contain a wealth of information. Research on event extraction in new domains makes it possible to build domain knowledge bases quickly, which in turn support downstream knowledge-based applications. However, existing event extraction systems are strongly domain-specific. Building one from scratch in a new domain depends heavily on the quality and scale of the event schemas and annotated data, and requires substantial human effort and expert knowledge to customize schemas and annotate corpora. Moreover, datasets commonly contain multiple associated event instances in the same context, which greatly hinders both event extraction and factuality prediction. This paper surveys the emerging research field of event extraction in new domains and reviews the state of research in three directions: event schema induction, collective event extraction, and event factuality prediction. It also discusses the key open problems and difficulties and points out research directions for future work.