[1] ZHAO Xuefeng, DI Hengxi, BAI Changze, et al. Multimodal aspect-based sentiment analysis combining multifaceted image feature extraction and gated fusion mechanism[J]. CAAI Transactions on Intelligent Systems, 2025, 20(6): 1461-1473. [doi:10.11992/tis.202503032]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785 / CN 23-1538/TP]
Volume: 20
Issue: 2025(6)
Pages: 1461-1473
Column:
Academic Papers: Machine Perception and Pattern Recognition
Publication date:
2025-11-05
- Title: Multimodal aspect-based sentiment analysis combining multifaceted image feature extraction and gated fusion mechanism
- Author(s): ZHAO Xuefeng; DI Hengxi; BAI Changze; ZHONG Zhaoman; ZHONG Xiaomin
- Affiliation: College of Computer Engineering, Jiangsu Ocean University, Lianyungang 222005, China
- Keywords: global feature; multimodal; aspect-based sentiment analysis; text description; gating mechanism; cross attention; image-prompt; pre-trained language model
- CLC: TP391
- DOI: 10.11992/tis.202503032
- Abstract:
Existing multimodal aspect-based sentiment analysis models extract only a single global image feature, overlooking key detailed information. To address this issue, this study proposes a network model that combines multifaceted image feature extraction with a gated fusion mechanism. Specifically, a multifaceted image feature extraction module is constructed in the proposed model: leveraging cross-modal translation, textual descriptions of scenes, human faces, objects, and colors are generated from multiple sentiment-related dimensions of the image, achieving both detailed information extraction and cross-modal alignment. Furthermore, a gated fusion interaction module is developed, incorporating a gating mechanism and interactive attention to enable efficient fusion and interaction between features. To bridge the representation gap across modalities, sequence information is integrated with image prompts to map image features into the input space of the pre-trained language model (PLM), enabling more accurate sentiment classification. Experiments on the Twitter-2015 and Twitter-2017 datasets show that, compared with existing models, the proposed model achieves average improvements of 0.93% in accuracy and 0.52% in F1-score, effectively enhancing sentiment classification performance.
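The gating mechanism mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation; the feature dimension, weight shapes, and function names below are illustrative assumptions. The gate computes a per-dimension weight from the concatenated text and image features, then forms an elementwise convex combination of the two modalities:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(text_feat, image_feat, W, b):
    """Hypothetical gated fusion: gate = sigmoid(W [t; v] + b),
    fused = gate * t + (1 - gate) * v (elementwise)."""
    concat = np.concatenate([text_feat, image_feat])  # shape (2d,)
    gate = sigmoid(W @ concat + b)                    # shape (d,), values in (0, 1)
    return gate * text_feat + (1.0 - gate) * image_feat

d = 4  # illustrative feature dimension
t = rng.standard_normal(d)        # stand-in text feature
v = rng.standard_normal(d)        # stand-in image feature
W = rng.standard_normal((d, 2 * d))
b = np.zeros(d)

fused = gated_fusion(t, v, W, b)
print(fused.shape)  # prints (4,)
```

Because the gate lies in (0, 1), each fused component is bounded between the corresponding text and image components, so the gate smoothly trades off how much each modality contributes per dimension.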