[1] QU Haicheng (曲海成), XU Bo (徐波). Multimodal sentiment analysis based on adaptive graph learning weight[J]. CAAI Transactions on Intelligent Systems (智能系统学报), 2025, 20(2): 516-528. [doi:10.11992/tis.202401001]
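CAAI Transactions on Intelligent Systems (《智能系统学报》) [ISSN 1673-4785/CN 23-1538/TP] Volume:
20
Issue:
No. 2, 2025
Pages:
516-528
Column:
Artificial Intelligence Deans Forum (人工智能院长论坛)
Publication date:
2025-03-05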
- Title:
-
Multimodal sentiment analysis based on adaptive graph learning weight
- Author(s):
-
QU Haicheng (曲海成), XU Bo (徐波)
-
School of Software, Liaoning Technical University (辽宁工程技术大学 软件学院), Huludao 125105, Liaoning, China
- Keywords:
-
multimodal; sentiment analysis; modal differences; information redundancy; adaptive graph learning; cross-modal attention; similarity constraints; information bottleneck
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.202401001
- Abstract:
-
In multimodal sentiment analysis, modalities express sentiment in inconsistent ways, so the density of sentiment information differs considerably across modalities. To balance this uneven distribution and to reduce redundancy in the multimodal feature representation, a multimodal sentiment analysis method based on adaptive graph learning weights is proposed. First, a different feature extraction method is applied to each modality to capture modality-specific information. Second, the modalities are mapped into the same space by a common encoder, and a cross-modal attention mechanism explicitly models the correlations between them. Third, each modality's prediction for the classification task, together with its representation, is embedded into an adaptive graph; modality labels are used to learn how much each modality contributes to the final classification, and the weights of the modalities are adjusted dynamically to follow changes in the dominant modality. Finally, an information bottleneck mechanism is introduced for denoising, with the aim of learning a nonredundant multimodal feature representation for sentiment prediction. The proposed model is evaluated on publicly available multimodal sentiment analysis datasets. Experimental results show that it effectively improves the accuracy of multimodal sentiment analysis.
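Two of the steps named in the abstract, cross-modal attention and dynamic modality weighting, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no equations, so the attention here is plain scaled dot-product attention without learned projections, and the adaptive graph learning is replaced by a simple softmax over hypothetical per-modality contribution scores. All function names, the toy features, and the scores are assumptions made for illustration; the information bottleneck step is not covered.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(queries, keys, values):
    """Scaled dot-product attention: queries come from one modality,
    keys and values from another (no learned projections in this sketch)."""
    dim = len(keys[0])
    attended = []
    for q in queries:
        # Similarity of this query step to every step of the other modality.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the other modality's value vectors.
        attended.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return attended


def fuse(modality_vectors, contribution_scores):
    """Weighted sum of per-modality vectors. The weights are a softmax over
    hypothetical contribution scores, a stand-in for learned graph weights."""
    weights = softmax(contribution_scores)
    size = len(modality_vectors[0])
    fused = [
        sum(w * vec[j] for w, vec in zip(weights, modality_vectors))
        for j in range(size)
    ]
    return fused, weights


# Toy inputs (assumed): 2 text steps and 3 audio steps, 2-dimensional features.
text = [[1.0, 0.0], [0.0, 1.0]]
audio = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

# Text attends to audio, then two vectors are fused with unequal scores.
text_given_audio = cross_modal_attention(text, audio, audio)
fused, weights = fuse([text[0], text_given_audio[0]], [2.0, 0.5])
```

In this sketch a larger contribution score gives that modality a larger share of the fused vector, which mirrors the idea of following the dominant modality. In the method described above, those weights would come from the adaptive graph rather than from fixed numbers.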
Last Update:
2025-03-05