ZHONG Zhaoman, FAN Jidong, ZHANG Yu, et al. Multimodal sentiment analysis model with convolutional cross-attention and cross-modal dynamic gating[J]. CAAI Transactions on Intelligent Systems, 2025, 20(4): 999-1009. [doi:10.11992/tis.202409012]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 20
Issue: No. 4, 2025
Pages: 999-1009
Section: Academic Papers - Machine Learning
Publication date: 2025-08-05
Title: Multimodal sentiment analysis model with convolutional cross-attention and cross-modal dynamic gating
Authors: ZHONG Zhaoman1,2, FAN Jidong1, ZHANG Yu1, WANG Chen1, LYU Huihui1, ZHANG Liling1
1. School of Computer Engineering, Jiangsu Ocean University, Lianyungang 222005, China;
2. Jiangsu Institute of Marine Resources Development, Lianyungang 222005, China
Keywords: multimodal fusion; sentiment analysis; emotional relevance; attention mechanism; convolutional cross-attention; cross-modal dynamic gating; global feature association; weight fusion
CLC number: TP391
DOI: 10.11992/tis.202409012
Online publication date: 2025-02-21
Abstract:
In multimodal sentiment analysis, existing methods tend to ignore the emotional correlation between images and text, which leaves a large amount of redundancy in the fused features. To address this, a multimodal sentiment analysis model based on convolutional cross-attention and cross-modal dynamic gating (CCA-CDG) is proposed. CCA-CDG introduces a convolutional cross-attention module (CCAM) to capture consistent expressions between image and text and obtain aligned image-text features, and employs a cross-modal dynamic gating module (CDGM) to dynamically modulate the fusion of emotional features according to the emotional relevance between the two modalities. In addition, considering the importance of image and text context for interpreting sentiment, a global feature fusion module is designed that fuses the image-text interaction features with the global features through weighting, yielding more reliable sentiment predictions. Experiments on the MVSA-Single and MVSA-Multi datasets show that the proposed CCA-CDG effectively improves multimodal sentiment analysis performance.
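The abstract describes the three modules only at a high level. As an illustration of the two fusion ideas it names, the PyTorch sketch below wires a cross-attention step with a local 1-D convolution to a scalar cross-modal gate that decides how much of the image-text interaction to keep. It is a minimal sketch under stated assumptions, not the authors' implementation: the class names (ConvCrossAttention, CrossModalDynamicGate), the dimensions, the choice of encoders, and the averaged "global" representation are all hypothetical.

```python
# Illustrative sketch only (not the paper's code): convolutional cross-attention for
# image-text alignment plus a cross-modal dynamic gate that weights the fused features.
import torch
import torch.nn as nn


class ConvCrossAttention(nn.Module):
    """Text-to-image cross-attention followed by a 1-D convolution over the attended
    sequence: one plausible reading of 'convolutional cross-attention'."""

    def __init__(self, dim: int = 256, kernel_size: int = 3):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.conv = nn.Conv1d(dim, dim, kernel_size, padding=kernel_size // 2)

    def forward(self, text: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
        # text: (B, Lt, dim), image: (B, Li, dim)
        aligned, _ = self.attn(query=text, key=image, value=image)
        # Conv1d expects (B, dim, L); the convolution smooths the aligned sequence locally.
        return self.conv(aligned.transpose(1, 2)).transpose(1, 2)  # (B, Lt, dim)


class CrossModalDynamicGate(nn.Module):
    """Scalar gate estimated from pooled text/image features; it scales how much of the
    cross-modal interaction is kept, mimicking gating by image-text emotional relevance."""

    def __init__(self, dim: int = 256):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1), nn.Sigmoid()
        )

    def forward(self, text_vec, image_vec, interaction_vec):
        g = self.gate(torch.cat([text_vec, image_vec], dim=-1))  # (B, 1), in [0, 1]
        # Assumed stand-in for the global feature fusion: a weighted mix of the gated
        # interaction and an averaged global representation of the two modalities.
        global_vec = 0.5 * (text_vec + image_vec)
        return g * interaction_vec + (1.0 - g) * global_vec


if __name__ == "__main__":
    B, Lt, Li, dim = 2, 20, 49, 256
    text = torch.randn(B, Lt, dim)   # e.g., token features from an assumed text encoder
    image = torch.randn(B, Li, dim)  # e.g., patch features from an assumed image encoder

    aligned = ConvCrossAttention(dim)(text, image)              # (B, Lt, dim)
    fused = CrossModalDynamicGate(dim)(
        text.mean(dim=1), image.mean(dim=1), aligned.mean(dim=1)
    )                                                            # (B, dim)
    logits = nn.Linear(dim, 3)(fused)                            # 3 sentiment classes on MVSA
    print(logits.shape)                                          # torch.Size([2, 3])
```

The gate collapses to the averaged global term when the two modalities carry little shared emotional signal, and passes the interaction through when they agree; the actual CCA-CDG weighting scheme may differ from this sketch.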
Notes/Memo
Received: 2024-09-06.
Funding: National Natural Science Foundation of China (72174079); Jiangsu Province "Qinglan Project" Excellent Big Data Teaching Team (2022-29).
About the authors: ZHONG Zhaoman, professor and dean of the School of Computer Engineering, Jiangsu Ocean University, and adjunct doctoral supervisor at China University of Mining and Technology. His main research interests are big data analysis and management of Internet public opinion. He has led one general program of the National Natural Science Foundation of China, received a second prize of the Science and Technology Progress Award of the Chinese Association of Automation, published more than 50 academic papers, and authored one monograph. E-mail: zhongzhaoman@163.com. FAN Jidong, master's student, whose main research interests are multimodal sentiment analysis and big data collection and analysis. E-mail: ffanjdong@163.com. ZHANG Yu, master's student, whose main research interests are online public opinion analysis and aspect-level sentiment analysis. E-mail: zhou90616@gmail.com.
Corresponding author: ZHONG Zhaoman. E-mail: zhongzhaoman@163.com