[1] HU Wenbin, CHEN Long, HUANG Xianbo, et al. A multimodal Chinese sarcasm detection model for emergencies based on cross attention[J]. CAAI Transactions on Intelligent Systems, 2024, 19(2): 392-400. [doi:10.11992/tis.202212011]
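CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 19
Issue: No. 2, 2024
Pages: 392-400
Column: Academic Papers: Natural Language Processing and Understanding
Publication date: 2024-03-05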
- Title: A multimodal Chinese sarcasm detection model for emergencies based on cross attention
- Author(s): HU Wenbin (1, 2), CHEN Long (1), HUANG Xianbo (1), CHEN Chen (1), ZHONG Zhaoman (1, 2)
  1. School of Computer Engineering, Jiangsu Ocean University, Lianyungang 222005, China
  2. Jiangsu Institute of Marine Resources Development, Lianyungang 222005, China
- Keywords: emergency; social media; multimodal comment; Chinese sarcasm detection; Chinese sarcasm dataset; cross-attention mechanism; attention mechanism; sentiment analysis
- CLC number: TP391
- DOI: 10.11992/tis.202212011
- Document code: 2023-12-21
- Abstract: Internet users often express their views through sarcasm when discussing emergencies on social media, which makes sentiment analysis more difficult. Existing work on Chinese sarcasm detection has paid little attention to the multimodal comments that users post on social media, so image-text multimodal Chinese sarcasm detection warrants closer study. We use a cross-attention mechanism to capture inconsistent expression between modalities and propose a multimodal Chinese sarcasm detection model, the fuse cross attention model (FCAM). In the model, a text convolutional neural network (TextCNN) and a deep residual network (ResNet) first extract shallow features of the Chinese text and image features, respectively. The cross-attention mechanism then produces attention features for the text layer and the image layer. Residual connections join the shallow text features with the text-layer attention features, and the image features with the image-layer attention features. An attention mechanism fuses the two resulting feature representations, and a classification layer outputs the sarcasm classification result. Based on Weibo comments on topics related to the COVID-19 epidemic in one region, we construct a multimodal Chinese sarcasm dataset for public health emergencies. Experiments on this dataset show that FCAM has certain advantages over the baseline models.
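The pipeline described in the abstract can be sketched roughly as follows. This is a minimal, dependency-free illustration under stated assumptions, not the authors' implementation: random vectors stand in for the TextCNN and ResNet outputs, the cross-attention uses plain scaled dot products with no learned projections, and the mean pooling, `fusion_w`, `clf_w`, and `clf_b` are hypothetical choices that the abstract does not specify.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, context):
    """Each query vector attends over the vectors of the other modality."""
    scale = math.sqrt(len(queries[0]))
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in context])
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, context)) for j in range(len(q))]
        )
    return attended


def residual(base, attended):
    """Residual connection: original features plus their attention features."""
    return [[b + a for b, a in zip(b_row, a_row)] for b_row, a_row in zip(base, attended)]


def mean_pool(rows):
    """Assumed pooling step: average the vectors into one per modality."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fcam_forward(text_feats, image_feats, fusion_w, clf_w, clf_b):
    # Cross-attention in both directions (text over image, image over text).
    text_att = cross_attention(text_feats, image_feats)
    image_att = cross_attention(image_feats, text_feats)
    # Residual connections, then one pooled vector per modality.
    text_vec = mean_pool(residual(text_feats, text_att))
    image_vec = mean_pool(residual(image_feats, image_att))
    # Attention-based fusion: softmax weights over the two modality vectors.
    fusion_weights = softmax([dot(fusion_w, text_vec), dot(fusion_w, image_vec)])
    fused = [fusion_weights[0] * t + fusion_weights[1] * i
             for t, i in zip(text_vec, image_vec)]
    # Classification layer: logistic output read as P(sarcastic).
    logit = dot(clf_w, fused) + clf_b
    return 1.0 / (1.0 + math.exp(-logit)), fusion_weights


rng = random.Random(0)
dim = 8
# Stand-ins for TextCNN features (5 positions) and ResNet features (4 regions).
text_feats = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(5)]
image_feats = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(4)]
fusion_w = [rng.gauss(0, 0.1) for _ in range(dim)]  # hypothetical fusion scorer
clf_w = [rng.gauss(0, 0.1) for _ in range(dim)]     # hypothetical classifier weights
prob, fusion_weights = fcam_forward(text_feats, image_feats, fusion_w, clf_w, 0.0)
print(f"P(sarcastic) = {prob:.3f}, fusion weights = {fusion_weights}")
```

With untrained random weights the printed probability carries no meaning. The sketch only shows how the features flow: cross-attention in both directions, a residual connection per modality, attention-weighted fusion, and a final classifier.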
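Memo
Received: 2022-12-08.
Funding: National Natural Science Foundation of China (72174079); Jiangsu Province "Qinglan Project" Outstanding Teaching Team (2022-29).
About the authors: HU Wenbin, associate professor, Ph.D., member of the China Computer Federation and of the Jiangsu Association of Artificial Intelligence. Main research interests: privacy protection in social networks and intelligent information processing. Has led or participated as a principal member in 5 completed provincial and municipal projects, and has received 1 provincial teaching achievement award and 1 provincial award for education, teaching, and research achievements. Contributed to 1 monograph and has published nearly 20 academic papers. E-mail: hwb1008@163.com; CHEN Long, master's student. Main research interests: natural language processing and public opinion analysis. E-mail: 956779521@qq.com; HUANG Xianbo, master's student. Main research interests: sentiment analysis and public opinion management. E-mail: 764157719@qq.com
Corresponding author: HU Wenbin. E-mail: hwb1008@163.com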