[1]YU Runyu,LI Yawen,LI Ang.Semantic similarity computing for scientific and technological conferences[J].CAAI Transactions on Intelligent Systems,2022,17(4):737-743.[doi:10.11992/tis.202203050]
CAAI Transactions on Intelligent Systems[ISSN 1673-4785/CN 23-1538/TP] Volume:
17
Issue:
2022, No. 4
Page number:
737-743
Column:
Academic Papers: Natural Language Processing and Understanding
Publication date:
2022-07-05
- Title:
Semantic similarity computing for scientific and technological conferences
- Author(s):
YU Runyu1; LI Yawen2; LI Ang1
1. Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, China;
2. School of Economics and Management, Beijing University of Posts and Telecommunications, Beijing 100876, China
- Keywords:
scientific and technological conference; deep learning; natural language processing; semantic learning; knowledge extraction; semantic similarity; pre-training model; Siamese network
- CLC:
TP391
- DOI:
10.11992/tis.202203050
- Abstract:
Current semantic text similarity methods have difficulty computing semantic similarity accurately on scientific and technological conference data. To address this, this paper proposes a Siamese-BERT semantic similarity algorithm for scientific and technological conferences fused with domain features (SBFD). First, the domain feature information of each conference is obtained through entity recognition and keyword extraction, and it is fed into the bidirectional encoder representations from transformers (BERT) network as a feature, together with the conference information. A Siamese network structure is then used to address the anisotropy problem of BERT. The output of the network is pooled and normalized, and finally the cosine similarity between the two conference representations is taken as their similarity. Experimental results show that the SBFD algorithm achieves good results on different data sets, with the Spearman's rank correlation coefficient improved to a certain extent.
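The final scoring steps named in the abstract (pooling, normalization, cosine similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: mean pooling is an assumption (the abstract only says the output is "pooled"), the token embeddings are toy arrays standing in for the output of one BERT branch, and the function names `mean_pool`, `l2_normalize`, and `conference_similarity` are hypothetical.

```python
import numpy as np


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padded positions (mask == 0).

    Assumption: mean pooling; the abstract does not name the pooling type.
    """
    mask = attention_mask[:, None].astype(float)
    return (token_embeddings * mask).sum(axis=0) / max(mask.sum(), 1.0)


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit length (eps guards against division by zero)."""
    return vec / max(np.linalg.norm(vec), eps)


def conference_similarity(tokens_a, mask_a, tokens_b, mask_b):
    """Cosine similarity of two pooled, normalized representations."""
    a = l2_normalize(mean_pool(tokens_a, mask_a))
    b = l2_normalize(mean_pool(tokens_b, mask_b))
    return float(np.dot(a, b))


# Toy stand-ins for the per-token outputs of the two Siamese branches.
tokens_a = np.array([[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]])
mask_a = np.array([1, 1, 0])  # third token is padding and is ignored
tokens_b = np.array([[2.0, 2.0]])
mask_b = np.array([1])

score = conference_similarity(tokens_a, mask_a, tokens_b, mask_b)
```

Because both vectors are normalized to unit length before the dot product, the score lies in [-1, 1] and depends only on the direction of the pooled vectors, not on their magnitude.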