[1] WU Nannan, GUO Zehao, ZHAO Yiming, et al. Name disambiguation method based on hyperbolic space feature fusion[J]. CAAI Transactions on Intelligent Systems, 2024, 19(1): 79-88. [doi:10.11992/tis.202209029]
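CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 19
Issue: 2024, No. 1
Pages: 79-88
Column: Academic Papers, Machine Perception and Pattern Recognition
Publication date: 2024-01-05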
- Title:
-
Name disambiguation method based on hyperbolic space feature fusion
- Author(s):
-
WU Nannan1, GUO Zehao1, ZHAO Yiming1, ZHEN Zixu1, WANG Wenjun1, LIU Yan2
-
1. College of Intelligence and Computing, Tianjin University, Tianjin 300354, China;
2. School of Computer Science and Technology, Anhui University, Hefei 230039, China
-
- Keywords:
-
name disambiguation; Euclidean space; hyperbolic space; network alignment; network representation learning; graph embedding; feature fusion; anchor link prediction
- CLC number:
-
TP39
- DOI:
-
10.11992/tis.202209029
- Document code:
-
2023-08-02
- Abstract:
-
Name duplication poses a growing challenge for research such as traditional user influence analysis, and the impact of name ambiguity is becoming increasingly serious. To address this, this paper proposes geometry interaction network alignment (GINA), a network alignment method that fuses features from hyperbolic space and Euclidean space and effectively models the main role that network structure plays in name disambiguation. The method performs network representation learning in both Euclidean space and hyperbolic space to capture network structural information with distinct spatial characteristics. It then applies cross-space network mapping and cross-space feature fusion so that the two spaces exchange information while the loss incurred by spatial mapping is kept as small as possible, yielding the final network representation, which is used for network alignment and, in turn, name disambiguation. A Chinese paper co-authorship network, an English paper co-authorship network, and a Chinese patent co-authorship network are aligned in pairs to construct a "Paper-Patent" and a "Chinese-English" empirical network alignment dataset. On these datasets, GINA is evaluated in two empirical scenarios: identifying individuals who share the same name, and matching author identities across Chinese and foreign papers. The results show that combining hyperbolic space with Euclidean space yields precision at least 24.9% higher than using a single space, confirming the effectiveness of the GINA method.
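The idea of moving features between hyperbolic and Euclidean space before fusing them can be illustrated with a small sketch. This is not the GINA model: the abstract does not specify its mapping, fusion, or alignment functions. The sketch assumes a Poincaré ball of curvature -1, uses the standard exponential and logarithmic maps at the origin as a stand-in for cross-space mapping, plain concatenation as a stand-in for feature fusion, and cosine-similarity nearest neighbour as a stand-in for alignment. The function names `expmap0`, `logmap0`, `fuse`, and `align` are hypothetical.

```python
import numpy as np

def expmap0(v, eps=1e-9):
    """Map tangent (Euclidean) vectors at the origin onto the Poincare ball (curvature -1)."""
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    return np.tanh(norm) * v / np.maximum(norm, eps)

def logmap0(x, eps=1e-9):
    """Inverse of expmap0: map ball points back to the tangent space at the origin."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    norm = np.clip(norm, eps, 1 - 1e-7)  # stay strictly inside the unit ball
    return np.arctanh(norm) * x / norm

def fuse(euc, hyp):
    """Assumed fusion: bring hyperbolic features into Euclidean space and concatenate."""
    return np.concatenate([euc, logmap0(hyp)], axis=-1)

def align(src, dst):
    """Assumed alignment: for each source node, pick the most cosine-similar target node."""
    s = src / np.linalg.norm(src, axis=1, keepdims=True)
    d = dst / np.linalg.norm(dst, axis=1, keepdims=True)
    return np.argmax(s @ d.T, axis=1)

# Toy usage with random features for 5 nodes (illustrative data only).
rng = np.random.default_rng(0)
euc = rng.normal(size=(5, 3))
hyp = expmap0(0.5 * rng.normal(size=(5, 3)))
fused = fuse(euc, hyp)
perm = np.array([2, 0, 4, 1, 3])
matches = align(fused, fused[perm])  # target network is a shuffled copy of the source
```

In this toy setup the target is a permuted copy of the source, so `matches` recovers the inverse permutation; real co-authorship networks would need learned embeddings and anchor links for supervision.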
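Memo
Received: 2022-09-15.
Funding: Key R&D and Transformation Program of Qinghai Province (2022-QY-218).
About the authors: WU Nannan, associate professor, senior member of the China Computer Federation. Main research interests are artificial intelligence and graph anomaly mining. Has participated in 2 National Key R&D Program projects and led 1 key R&D program project, and received a third prize for Outstanding Think Tank Achievements of Tianjin. Has published more than 10 academic papers. E-mail: nannan.wu@tju.edu.cn; GUO Zehao, master's student. Main research interest is graph anomaly detection. E-mail: 3018208080@tju.edu.cn; ZHAO Yiming, doctoral student. Main research interests are artificial intelligence and graph anomaly mining. E-mail: 945160031@qq.com
Corresponding author: ZHAO Yiming. E-mail: 945160031@qq.com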