[1]MAO Mingyi,WU Chen,ZHONG Yixin,et al.BERT named entity recognition model with self-attention mechanism[J].CAAI Transactions on Intelligent Systems,2020,15(4):772-779.[doi:10.11992/tis.202003003]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 15
Issue: 2020(4)
Pages: 772-779
Column: Wu Wenjun Artificial Intelligence Science and Technology Award Forum
Publication date: 2020-07-05
Title: BERT named entity recognition model with self-attention mechanism
Author(s): MAO Mingyi1; WU Chen1; ZHONG Yixin2; CHEN Zhicheng2
1. School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China;
2. School of Computer, Beijing University of Posts and Telecommunications, Beijing 100876, China
Keywords: named entity recognition; bidirectional encoder representation from transformers; self-attention mechanism; deep learning; conditional random field; natural language processing; bi-directional long short-term memory; sequence tagging
CLC: TP391
DOI: 10.11992/tis.202003003
Abstract: Named entity recognition is part of lexical analysis in natural language processing and is the basis for a computer to correctly understand natural language. To strengthen the model's recognition of named entities, this study uses the pre-trained model BERT (bidirectional encoder representation from transformers) as the embedding layer with fixed parameters, avoiding the high computational cost of fine-tuning BERT. A BERT-BiLSTM-CRF model was built, and two improvements to it were examined experimentally. The first adds a self-attention layer on top of the model; experiments show that this addition does not significantly improve recognition performance. The second reduces the number of BERT embedding layers; experiments show that moderately reducing the number of layers improves named entity recognition accuracy while shortening the overall training time. With 9-layer embedding, the F1 value on the MSRA Chinese dataset increased to 94.79%, and the F1 value on the Weibo Chinese dataset reached 68.82%.
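A minimal sketch of the architecture the abstract describes, not the authors' released code: BERT serves as a frozen embedding layer whose encoder is truncated to the first 9 transformer blocks, followed by a BiLSTM and a CRF tagging layer. The library choices (HuggingFace transformers, pytorch-crf) and all hyperparameter values below are assumptions for illustration.

```python
import torch
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF


class BertBiLstmCrf(nn.Module):
    """Sketch of a BERT-BiLSTM-CRF tagger with frozen, truncated BERT embeddings."""

    def __init__(self, num_tags, bert_name="bert-base-chinese",
                 num_bert_layers=9, lstm_hidden=256):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        # Keep only the first `num_bert_layers` transformer blocks ("method two").
        self.bert.encoder.layer = self.bert.encoder.layer[:num_bert_layers]
        # Fixed-parameter embedding: BERT itself is not fine-tuned.
        for p in self.bert.parameters():
            p.requires_grad = False
        self.bilstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * lstm_hidden, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        with torch.no_grad():  # frozen BERT used purely as an embedding layer
            emb = self.bert(input_ids,
                            attention_mask=attention_mask).last_hidden_state
        feats, _ = self.bilstm(emb)
        emissions = self.fc(feats)
        mask = attention_mask.bool()
        if tags is not None:
            # Training: negative log-likelihood of the gold tag sequence under the CRF.
            return -self.crf(emissions, tags, mask=mask, reduction="mean")
        # Inference: Viterbi-decoded tag sequences.
        return self.crf.decode(emissions, mask=mask)
```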