[1]ZHU Chaojie,YAN Yuming,CHU Baochang,et al.Aspect-level multimodal sentiment analysis via object-attention[J].CAAI Transactions on Intelligent Systems,2024,19(6):1562-1572.[doi:10.11992/tis.202404009]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 19
Issue: 2024(6)
Pages: 1562-1572
Column: Artificial Intelligence Deans Forum
Publication date: 2024-12-05
- Title:
Aspect-level multimodal sentiment analysis via object-attention
- Author(s):
ZHU Chaojie1; YAN Yuming2; CHU Baochang2; LI Gang2; HUANG Heyan1; GAO Xiaoyan3
1. School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China;
2. Beijing Huadian E-Commerce Technology Co., Ltd., Beijing 100073, China;
3. Faculty of Information Technology, Beijing University of Technology, Beijing, China
- Keywords:
aspect-level sentiment analysis; multimodal; sentiment analysis; object detection; self-attention; natural language processing systems; deep learning; feature extraction
- CLC:
TP391
- DOI:
10.11992/tis.202404009
- Abstract:
Aspect-level multimodal sentiment analysis (ALMSA) aims to identify the sentiment polarity of a specific aspect word using both sentence and image data. Current models often rely on the global features of images, overlooking the details in the original image. To address this issue, we propose an object attention-based aspect-level multimodal sentiment analysis model (OAB-ALMSA). This model first employs an object detection algorithm to capture the detailed information of the objects in the original image. It then applies an object-attention mechanism and builds an iterative fusion layer to fully fuse the multimodal information. Finally, a curriculum learning strategy is developed to tackle the challenge of training with complex samples. Experiments conducted on the TWITTER-2015 dataset demonstrate that OAB-ALMSA, when combined with curriculum learning, achieves the highest F1 score. These results highlight that leveraging detailed image data enhances the model's overall understanding and improves prediction accuracy.
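The core idea described in the abstract, attending from a text-side aspect representation over per-object image features rather than one global image vector, can be sketched as a standard scaled dot-product attention step. The following is a minimal NumPy illustration, not the paper's implementation; all names, dimensions, and the random features are hypothetical stand-ins for the encoder outputs and detector features the model would actually produce.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax for the attention weights.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def object_attention(aspect_vec, object_feats):
    """Attend from an aspect embedding over detected-object features.

    aspect_vec   : (d,)   text-side representation of the aspect word
    object_feats : (n, d) features of n objects found by a detector
    Returns the attention-pooled object feature and the weights.
    """
    d = aspect_vec.shape[-1]
    scores = object_feats @ aspect_vec / np.sqrt(d)   # (n,) relevance of each object
    weights = softmax(scores)                         # normalize to a distribution
    pooled = weights @ object_feats                   # (d,) aspect-aware image feature
    return pooled, weights

# Hypothetical example: 5 detected objects, 8-dim features.
rng = np.random.default_rng(0)
aspect = rng.normal(size=8)
objects = rng.normal(size=(5, 8))
pooled, weights = object_attention(aspect, objects)
```

An iterative fusion layer, as the abstract describes, would repeat such cross-attention steps in both directions (text-to-objects and objects-to-text) so each modality's representation is refined by the other before classification.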