[1] YU Runyu, DU Junping, XUE Zhe, et al. Research on named entity recognition for scientific and technological conferences[J]. CAAI Transactions on Intelligent Systems, 2022, 17(1): 50-58. [doi:10.11992/tis.202107010]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 17
Issue: 2022, No. 1
Page number: 50-58
Column: Academic Papers: Machine Learning
Publication date: 2022-01-05
- Title: Research on named entity recognition for scientific and technological conferences
- Author(s): YU Runyu (1); DU Junping (1); XUE Zhe (1); XU Xin (1); XI Junqing (2)
  1. Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, China
  2. Judicial Information Centre, Beijing 100020, China
- Keywords: named entity recognition; long short-term memory network; attention mechanism; character-word fusion; accurate portrait; natural language processing; information extraction; pre-trained models
- CLC: TP391
- DOI: 10.11992/tis.202107010
- Abstract:
General-domain named entity recognition algorithms cannot fully mine the semantic information in paper data from scientific and technological academic conferences. To address this problem, a named entity recognition algorithm for scientific and technological conferences is proposed that combines a keyword-character long short-term memory (LSTM) network with an attention mechanism. First, keyword features in the data set are pretrained to obtain latent semantic information at the word level, which is merged with character-level semantic information to mitigate the effect of incorrect word boundaries on recognition accuracy. Then, the output vectors of the bidirectional long short-term memory (BiLSTM) network and of the attention mechanism are fused, so that both contextual and global information are taken into account. Finally, a conditional random field (CRF) is used to identify entities. Experimental results show that the proposed algorithm achieves better recognition results on different data sets, improving accuracy, recall, and F1 score over the comparison algorithms.
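
The final step of the pipeline, CRF-based entity identification, can be illustrated with a minimal sketch of Viterbi decoding for a linear-chain CRF. This is not the authors' implementation: the tag set (`O`, `B-CONF`, `I-CONF`), the emission scores (standing in for the fused BiLSTM and attention outputs), and the transition scores are all hypothetical values chosen for illustration. The sketch only shows why a CRF layer can help: transition scores let the decoder reject tag sequences that are locally attractive but structurally invalid.

```python
from typing import Dict, List, Sequence, Tuple


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: Sequence[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][tag] is the per-token score of `tag` at position t.
    transitions[(prev, cur)] is the score of moving from `prev` to `cur`;
    pairs that are not listed score 0.0.
    """
    if not emissions:
        return []

    # best[t][tag]: best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t - 1][tag]: best previous tag for `tag` at position t

    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Trace the best path backwards from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical tag set and scores (assumptions, not taken from the paper).
TAGS = ["O", "B-CONF", "I-CONF"]
# Penalize the invalid BIO transition O -> I-CONF.
TRANSITIONS = {("O", "I-CONF"): -10.0}
EMISSIONS = [
    {"O": 2.0, "B-CONF": 0.5, "I-CONF": 0.0},
    {"O": 0.2, "B-CONF": 1.0, "I-CONF": 1.5},  # I-CONF is the local maximum
    {"O": 0.1, "B-CONF": 0.3, "I-CONF": 2.0},
    {"O": 1.5, "B-CONF": 0.2, "I-CONF": 0.4},
]

decoded = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
print(decoded)  # ['O', 'B-CONF', 'I-CONF', 'O']
```

Picking the highest emission score at each position independently would give `O, I-CONF, I-CONF, O`, which starts an entity with an inside tag. With the transition penalty, the decoder instead selects `B-CONF` at the second position, yielding a well-formed entity span.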