LI Yunjie, WANG Danyang, LIU Haitao, et al. Document-level relation extraction of a graph reasoning embedded dynamic self-attention network[J]. CAAI Transactions on Intelligent Systems, 2025, 20(1): 52-63. [doi:10.11992/tis.202311021]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
20
Issue:
2025, No. 1
Pages:
52-63
Column:
Academic Papers: Machine Learning
Publication date:
2025-01-05
- Title:
-
Document-level relation extraction of a graph reasoning embedded dynamic self-attention network
- Author(s):
-
LI Yunjie1, WANG Danyang2, LIU Haitao1,3, WANG Huadong4, WANG Peizhuang3
-
1. Institute of Mathematics and Systems Science, Liaoning Technical University, Fuxin 123000, China;
2. Institute of Scientific and Technical Information, Chinese Academy of Tropical Agricultural Sciences, Haikou 571000, China;
3. Institute of Intelligence Engineering and Mathematics, Liaoning Technical University, Fuxin 123000, China;
4. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
-
- Keywords:
-
document-level relation extraction; graph reasoning; dynamic self-attention network; self-attention mechanism; gated token selection mechanism; document graph; graph attention network; keyword
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.202311021
- Abstract:
-
Document-level relation extraction aims to extract all entity pairs with semantic relations from a document and to classify those relations. Unlike sentence-level relation extraction, determining an entity relation here requires reasoning over multiple sentences in the document. Existing methods mainly apply self-attention to document-level relation extraction, but doing so faces two technical challenges: the high computational complexity of semantically encoding long texts, and the complex reasoning required for relation prediction. To address these, a graph reasoning embedded dynamic self-attention network (GSAN) is proposed. Guided by a gated token selection mechanism, GSAN dynamically selects important tokens for computing self-attention, efficiently modeling the semantic dependencies of long texts. It further builds a document graph in which the selected tokens serve as the global semantic background alongside entity-candidate and document nodes, and it embeds the graph-reasoning aggregation over this document graph into the dynamic self-attention module, equipping the model with the capacity for complex reasoning. Experimental results on the public document-level relation extraction datasets CDR and DocRED show that the proposed model significantly outperforms other baseline models.
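The abstract names two core mechanisms but the page carries no code. The NumPy sketch below illustrates the general idea only, not the authors' implementation: score every token with a gate, keep the top-k, and let all tokens attend only to that selected subset, reducing attention cost from O(n^2) to O(n·k). All names and choices here (`w_gate`, the sigmoid gate, top-k selection, the dimensions) are illustrative assumptions; the paper's graph-reasoning embedding and GAT details are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_token_selection(H, w_gate, k):
    """Score every token with a gate vector and keep the top-k.
    H: (n, d) token representations; w_gate: (d,) assumed learned gate weights."""
    scores = H @ w_gate                        # (n,) importance score per token
    idx = np.argsort(-scores)[:k]              # indices of the k highest-scoring tokens
    gate = 1.0 / (1.0 + np.exp(-scores[idx]))  # sigmoid gate values for kept tokens
    return idx, gate[:, None] * H[idx]         # (k,), (k, d) gated selected tokens

def sparse_self_attention(H, H_sel, d_k):
    """All n tokens attend only to the k selected tokens: O(n*k), not O(n^2)."""
    A = softmax(H @ H_sel.T / np.sqrt(d_k))    # (n, k) attention weights
    return A @ H_sel                           # (n, d) updated token representations

rng = np.random.default_rng(0)
n, d, k = 64, 16, 8                            # 64 tokens, keep only 8
H = rng.normal(size=(n, d))
w_gate = rng.normal(size=d)
idx, H_sel = gated_token_selection(H, w_gate, k)
out = sparse_self_attention(H, H_sel, d)
print(out.shape)  # (64, 16)
```

In the full model, the selected tokens would additionally form graph nodes (together with entity-candidate and document nodes), and the graph-aggregated representations would be fed back into this attention step.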
Memo
Received: 2023-11-17.
Foundation items: National Natural Science Foundation of China (61350003); Key Research Project of the Basic Scientific Research Projects of Universities of the Department of Education of Liaoning Province (LJKZZ20220047); Central Public-interest Scientific Institution Basal Research Fund (1630072023005).
Biographies: LI Yunjie, master's student; research interests: natural language processing and machine learning. E-mail: 2621991259@qq.com. WANG Danyang, research assistant; research interest: intelligent information processing. E-mail: danyang.wang@catas.cn. LIU Haitao, associate professor, Ph.D.; research interests: natural language processing, machine learning, and factor space theory. He is a director of the Fuzzy Information and Engineering Branch of the Operations Research Society of China, has received one municipal first prize for scientific and technological progress and one municipal grand prize and one first prize for natural science academic achievements, and has published more than 30 papers as first author. His monograph Factor Space and Artificial Intelligence was supported by the National Publication Foundation and selected as a major publishing project of the national "13th Five-Year" key book publishing plan. E-mail: haitao641@163.com.
Corresponding author: LIU Haitao. E-mail: haitao641@163.com
Last update:
2025-01-05