[1] WANG Yicheng, WAN Fucheng, MA Ning. Chinese semantic role labeling with multi-level linguistic features[J]. CAAI Transactions on Intelligent Systems, 2020, 15(1): 107-113. [doi:10.11992/tis.201910012]

Chinese semantic role labeling with multi-level linguistic features

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
15
Issue:
No. 1, 2020
Pages:
107-113
Section:
Academic Papers: Natural Language Processing and Understanding
Publication date:
2020-01-01

Article Info

Title:
Chinese semantic role labeling with multi-level linguistic features
Author(s):
WANG Yicheng (1, 2), WAN Fucheng (1), MA Ning (2)
1. Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, Gansu 730030, China;
2. Key Laboratory of China's Ethnic Languages and Intelligent Processing of Gansu Province, Northwest Minzu University, Lanzhou, Gansu 730030, China
Keywords:
natural language processing; semantic role labeling; deep learning; Bi-LSTM; linguistic features; post-processing layer; max pooling
CLC number:
TP391
DOI:
10.11992/tis.201910012
Abstract:
With the rapid development of artificial intelligence and Chinese information processing technology, research in natural language processing has gradually moved to the level of semantic understanding, where Chinese semantic role labeling is a core technology. In Chinese information processing, where statistical machine learning still dominates, traditional labeling methods depend heavily on how thoroughly the syntax and semantics of a sentence have been parsed, so their labeling accuracy is limited and no longer meets current needs. To address this problem, this paper improves a baseline Bi-LSTM model for Chinese semantic role labeling: max pooling is applied in the post-processing stage of the model, and multi-level linguistic features, such as lexical and sentence-pattern features, are incorporated during training. Multiple groups of experiments, supported by linguistic analysis, show that the targeted improvements significantly raise labeling accuracy, demonstrating that incorporating relevant linguistic features into a Bi-LSTM semantic role labeling model combined with max pooling improves labeling performance.
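The two ideas named in the abstract, feeding linguistic features into the model alongside word representations and applying max pooling to the sequence of per-token states, can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the feature set, the vector sizes, or exactly where pooling is applied. The lookup tables `WORD_VECS` and `POS_VECS`, the function names, and the toy values are all hypothetical, and the Bi-LSTM itself is omitted (the `states` argument stands in for its per-token outputs).

```python
from typing import List

# Hypothetical toy embeddings (assumption: not taken from the paper).
# Tokens are romanized: "ta" (he), "chi" (eat), "pingguo" (apple).
WORD_VECS = {"ta": [0.2, -0.1], "chi": [0.7, 0.3], "pingguo": [-0.4, 0.9]}
# Hypothetical part-of-speech feature vectors, one possible lexical feature.
POS_VECS = {"PN": [1.0, 0.0], "VV": [0.0, 1.0], "NN": [0.5, 0.5]}


def build_inputs(words: List[str], pos_tags: List[str]) -> List[List[float]]:
    """Concatenate each word vector with its POS feature vector."""
    if len(words) != len(pos_tags):
        raise ValueError("words and pos_tags must have the same length")
    return [WORD_VECS[w] + POS_VECS[p] for w, p in zip(words, pos_tags)]


def max_pool_over_time(states: List[List[float]]) -> List[float]:
    """Take the element-wise maximum across all time steps.

    `states` holds one vector per token (for example, Bi-LSTM outputs);
    the result is a single fixed-size vector for the whole sentence.
    """
    if not states:
        raise ValueError("states must contain at least one vector")
    return [max(column) for column in zip(*states)]


if __name__ == "__main__":
    inputs = build_inputs(["ta", "chi", "pingguo"], ["PN", "VV", "NN"])
    # In a real model the inputs would pass through a Bi-LSTM first;
    # here pooling is applied to them directly to keep the sketch short.
    print(max_pool_over_time(inputs))
```

Max pooling keeps the strongest activation in each dimension regardless of sentence length, which is one common reason to use it as a summarizing step after a recurrent layer.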

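Memo:
Received: 2019-10-11.
Funding: National Natural Science Foundation of China (61602387, 61762076)
About the authors: WANG Yicheng, master's student. Main research interests: natural language processing and automatic question answering. WAN Fucheng, associate professor. Main research interests: natural language processing, machine translation, information extraction, and automatic question answering. Has led or participated in more than 10 national and provincial/ministerial-level projects, holds more than 10 patents and software copyrights, and has published 4 books and more than 20 academic papers. MA Ning, professor. Main research interests: natural language processing and computer applications. Has led or participated in 3 National Natural Science Foundation of China projects, and has published 1 academic monograph and more than 40 academic papers.
Corresponding author: WAN Fucheng. E-mail: wanfucheng@126.com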