[1] ZHANG Hongxi, CAI Zhijie. Construction of a Tibetan verb-ending type dataset for automatic question answering[J]. CAAI Transactions on Intelligent Systems, 2025, 20(5): 1207-1216. [doi:10.11992/tis.202410002]

Construction of a Tibetan verb-ending type dataset for automatic question answering

References:
[1] WEN Sen, QIAN Li, HU Maodi, et al. Review of research progress on question-answering techniques based on large language models[J]. Data analysis and knowledge discovery, 2024, 8(6): 16-29.
[2] WANG Na, LI Jie. User satisfaction evaluation of FAQ system based on AHP-entropy weight method: taking the question answering robot of university library as an example[J]. Information science, 2023, 41(9): 164-172.
[3] CHE Wanxiang, DOU Zhicheng, FENG Yansong, et al. Towards a comprehensive understanding of the impact of large language models on natural language processing: challenges, opportunities and future directions[J]. Scientia sinica (informationis), 2023, 53(9): 1645-1687.
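[4] CAI Zhijie. Research work report on statistics of Tibetan sentence pattern structure distribution for natural language processing (13BYY141)[R]. Qinghai: National Social Science Fund Project, 2016.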
[5] RAJPURKAR P, ZHANG Jian, LOPYREV K, et al. SQuAD: 100,000+ questions for machine comprehension of text[C]//Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Austin: ACL, 2016: 2383-2392.
[6] BAJAJ P, CAMPOS D, CRASWELL N, et al. MS MARCO: a human generated MAchine reading COmprehension dataset[EB/OL]. (2018-10-31)[2024-10-02]. https://arxiv.org/abs/1611.09268v3.
[7] JOSHI M, CHOI E, WELD D, et al. TriviaQA: a large scale distantly supervised challenge dataset for reading comprehension[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Vancouver: ACL, 2017: 1601-1611.
[8] SAHA A, ARALIKATTE R, KHAPRA M M, et al. DuoRC: towards complex language understanding with paraphrased reading comprehension[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Melbourne: ACL, 2018: 1683-1693.
[9] KOČISKÝ T, SCHWARZ J, BLUNSOM P, et al. The NarrativeQA reading comprehension challenge[J]. Transactions of the association for computational linguistics, 2018, 6: 317-328.
[10] LAI Guokun, XIE Qizhe, LIU Hanxiao, et al. RACE: large-scale ReAding comprehension dataset from examinations[C]//Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Copenhagen: ACL, 2017: 785-794.
[11] RICHARDSON M, BURGES C J C, RENSHAW E. MCTest: a challenge dataset for the open-domain machine comprehension of text[C]//Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Seattle: ACL, 2013: 193-203.
[12] HUANG Lifu, LE BRAS R, BHAGAVATULA C, et al. Cosmos QA: machine reading comprehension with contextual commonsense reasoning[C]//Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Hong Kong: ACL, 2019: 2391-2401.
[13] HERMANN K M, KOČISKÝ T, GREFENSTETTE E, et al. Teaching machines to read and comprehend[J]. Advances in neural information processing systems, 2015, 28: 1693-1701.
[14] ONISHI T, WANG Hai, BANSAL M, et al. Who did what: a large-scale person-centered cloze dataset[C]//Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Austin: ACL, 2016: 2230-2235.
[15] CUI Yiming, LIU Ting, CHE Wanxiang, et al. A span-extraction dataset for Chinese machine reading comprehension[C]//Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Hong Kong: ACL, 2019: 5883-5889.
[16] SHAO C C, LIU T, LAI Yuting, et al. DRCD: a Chinese machine reading comprehension dataset[EB/OL]. (2019-05-29)[2024-10-02]. https://arxiv.org/abs/1806.00920v3.
[17] HE Wei, LIU Kai, LIU Jing, et al. DuReader: a Chinese machine reading comprehension dataset from real-world applications[C]//Proceedings of the Workshop on Machine Reading for Question Answering. Melbourne: ACL, 2018: 37-46.
[18] XU Canwen, PEI Jiaxin, WU Hongtao, et al. MATINF: a jointly labeled large-scale dataset for classification, question answering and summarization[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. [S.l.]: ACL, 2020: 3586-3596.
[19] ZHONG Haoxi, XIAO Chaojun, TU Cunchao, et al. JEC-QA: a legal-domain question answering dataset[J]. Proceedings of the AAAI conference on artificial intelligence, 2020, 34(5): 9701-9708.
[20] SUN Kai, YU Dian, YU Dong, et al. Investigating prior knowledge for challenging Chinese machine reading comprehension[J]. Transactions of the association for computational linguistics, 2020, 8: 141-155.
[21] ZHENG Chujie, HUANG Minlie, SUN Aixin. ChID: a large-scale Chinese IDiom dataset for cloze test[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence: ACL, 2019: 778-787.
[22] SUN Yuan, DAN Zhengcuo, LIU Sisi, et al. TibetanQA: a dataset of Tibetan for machine reading comprehension[J]. China scientific data, 2022, 7(2): 34-42.
[23] SUN Yuan, LIU Sisi, CHEN Chaofan, et al. Construction of high-quality Tibetan dataset for machine reading comprehension[J]. Journal of Chinese information processing, 2024, 38(3): 56-64.
[24] SHI Xiaodong, LU Yajun. A Tibetan segmentation system: Yangjin[J]. Journal of Chinese information processing, 2011, 25(4): 54-56.
[25] GESANG Jumian, GESANG Yangjing. Practical Tibetan grammar course[M]. Chengdu: Sichuan Nationalities Publishing House, 2004.
[26] BAN Mabao, CAI Zhijie, LA M. Tibetan interrogative sentences parsing based on PCFG[J]. Journal of Chinese information processing, 2019, 33(2): 67-74.
[27] SEO M, KEMBHAVI A, FARHADI A, et al. Bidirectional attention flow for machine comprehension[EB/OL]. (2018-06-21)[2024-10-02]. https://arxiv.org/abs/1611.01603v6.
[28] WANG Wenhui, YANG Nan, WEI Furu, et al. Gated self-matching networks for reading comprehension and question answering[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Vancouver: ACL, 2017: 189-198.
[29] YU A W, DOHAN D, LUONG M T, et al. QANet: combining local convolution with global self-attention for reading comprehension[EB/OL]. (2018-04-23)[2024-10-02]. https://arxiv.org/abs/1804.09541v1.

Memo

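Received: 2024-10-02.
Funding: National Natural Science Foundation of China (61966031, 61866032); Project of the Key Laboratory of Tibetan Information Processing, Ministry of Education (2013-Z-Y17, 2014-Z-Y32, 2015-Z-Y03).
About the authors: ZHANG Hongxi, master's student; main research interests are Tibetan information processing and Tibetan natural language processing. E-mail: 1036974179@qq.com. CAI Zhijie, professor, doctoral supervisor, PhD; main research interests are Tibetan information processing and Tibetan natural language processing; has published 64 academic papers. E-mail: czjqhsd@163.com.
Corresponding author: CAI Zhijie. E-mail: Czjqhsd@163.com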

Last Update: 2025-09-05