[1] XU Jian. Generating reading comprehension questions automatically based on semantic graphs[J]. CAAI Transactions on Intelligent Systems, 2024, 19(2): 420-428. [doi:10.11992/tis.202207001]
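CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
19
Issue:
No. 2, 2024
Pages:
420-428
Column:
Academic Papers: Natural Language Processing and Understanding
Publication date:
2024-03-05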
- Title:
-
Generating reading comprehension questions automatically based on semantic graphs
- Author(s):
-
XU Jian (徐坚)1,2
-
1. Key Laboratory of Educational Informatization for Nationalities, Ministry of Education, Yunnan Normal University, Kunming 650500, China;
2. School of Information Engineering, Qujing Normal University, Qujing 655011, China
-
- Keywords:
-
semantic graph; dataset; automatic question generation model; encoder; decoder; answer tagging; graph attention network; gated recurrent units
- CLC number:
-
TP311
- DOI:
-
10.11992/tis.202207001
- Document code:
-
2023-11-16
- Abstract:
-
Automatic question generation is an artificial intelligence technique that aims to emulate a human question writer by producing relevant questions from input text. Existing work generates questions mainly from general-purpose datasets, and little of it targets education specifically. This article therefore focuses on automatic question generation for middle school students. First, it constructs RACE4QG, a dataset designed for training question generation models and tailored to the needs of middle school education. Second, it develops an end-to-end question generation model, trained on RACE4QG, that uses an improved encoder-decoder scheme. The encoder consists mainly of a two-layer bidirectional gated recurrent unit whose input is the word embeddings and answer-tag embeddings; a gated self-attention mechanism over the encoder's hidden layer produces a joint passage-answer representation, which is fed to the decoder to generate the question. Experimental results show that the model outperforms the best baseline, improving BLEU-4, ROUGE-L, and METEOR by 3.61, 1.66, and 1.44 percentage points, respectively.
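The encoder input described in the abstract pairs each word embedding with an answer-tag embedding. The sketch below illustrates one way this could be prepared. It is not the article's implementation: the B/I/O tagging scheme, the function names `tag_answer` and `encoder_inputs`, and the toy lookup tables are all assumptions made for illustration, since the abstract only states that answer-tag embeddings are used.

```python
from typing import Dict, List, Sequence


def tag_answer(passage: Sequence[str], answer: Sequence[str]) -> List[str]:
    """Label each passage token B (answer start), I (inside answer) or O (outside).

    Assumption: a B/I/O scheme; only the first verbatim occurrence of the
    answer span is tagged. If the answer is absent, every token stays O.
    """
    tags = ["O"] * len(passage)
    n = len(answer)
    if n == 0:
        return tags
    for start in range(len(passage) - n + 1):
        if list(passage[start:start + n]) == list(answer):
            tags[start] = "B"
            for i in range(start + 1, start + n):
                tags[i] = "I"
            break
    return tags


def encoder_inputs(
    passage: Sequence[str],
    answer: Sequence[str],
    word_vecs: Dict[str, List[float]],
    tag_vecs: Dict[str, List[float]],
) -> List[List[float]]:
    """Concatenate each token's word vector with its answer-tag vector.

    The lookup tables are hypothetical stand-ins for learned embeddings;
    `+` on two lists is concatenation, not element-wise addition.
    """
    tags = tag_answer(passage, answer)
    return [word_vecs[tok] + tag_vecs[tag] for tok, tag in zip(passage, tags)]


if __name__ == "__main__":
    passage = "Tom went to the park with his dog".split()
    print(tag_answer(passage, ["the", "park"]))
    # prints ['O', 'O', 'O', 'B', 'I', 'O', 'O', 'O']
```

In a full model, the concatenated vectors would be the per-token input to the two-layer bidirectional gated recurrent unit, so the encoder can tell which part of the passage the generated question should be about.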
Last Update:
1900-01-01