[1] JIANG Wentao, WANG Xinjie, ZHANG Shengchong. Spatially constrained attention mechanism for image classification network[J]. CAAI Transactions on Intelligent Systems, 2025, 20(6): 1444-1460. [doi:10.11992/tis.202505025]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 20
Issue: 6, 2025
Pages: 1444-1460
Column: Academic Papers - Machine Perception and Pattern Recognition
Publication date: 2025-11-05
- Title: Spatially constrained attention mechanism for image classification network
- Author(s): JIANG Wentao1; WANG Xinjie1; ZHANG Shengchong2
- Affiliation(s):
1. School of Software, Liaoning Technical University, Huludao 125105, China;
2. Key Laboratory of Optoelectronic Information Control and Security Technology, Tianjin 300308, China
- Keywords: image classification; spatially constrained attention mechanism; edge-aware convolution; stochastic pooling; spatial information; edge features; feature fusion; residual network
- CLC: TP391
- DOI: 10.11992/tis.202505025
- Abstract:
This paper addresses two major issues in image classification networks: insufficient low-level feature extraction and inadequate spatial weighting of feature maps. A novel image classification network, SCAM-Net (spatially constrained attention mechanism for image classification network), is proposed, built on the WideResNet-28-10 architecture. First, a spatially constrained attention (SCA) mechanism is introduced. By combining a spatial constraint strategy with dynamic weighting, SCA significantly enhances the network's perception of spatial positions in feature maps, enabling the model to focus more precisely on critical regions, improving the quality of feature representation, and yielding better discrimination in complex scenarios. Second, an edge-aware convolution (EAConv) is developed. EAConv integrates Sobel operators with convolutions of multiple kernel sizes to capture multi-level edge information, compensating for the weak edge-extraction capability of the original first convolutional layer. Experimental results show that SCAM-Net outperforms the WideResNet-28-10 baseline by 2.43%, 0.93%, 0.14%, and 0.91% on the CIFAR-100, CIFAR-10, SVHN, and GTSRB datasets, respectively, and exceeds the second-best model, QKFormer, by 0.13%, 0.10%, 0.12%, and 0.34% in classification accuracy on the same datasets. These results confirm that the cooperation between the spatially constrained attention mechanism and the edge-aware convolution allows SCAM-Net to better capture fine-grained visual details and effectively improve image classification performance.
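The abstract states that EAConv combines fixed Sobel operators with learnable convolutions of multiple kernel sizes. The paper's exact design is not given here, but the general idea can be sketched as follows. This is a minimal PyTorch illustration, not the authors' implementation: the module name `EAConvSketch`, the choice of 3×3 and 5×5 learnable branches, and the 1×1 fusion layer are all assumptions for the sake of the example.

```python
import torch
import torch.nn as nn


class EAConvSketch(nn.Module):
    """Hypothetical sketch of an edge-aware convolution (EAConv).

    Fixed (non-learnable) Sobel filters extract edge responses, learnable
    convolutions with two kernel sizes capture multi-level structure, and a
    1x1 convolution fuses all branches. The paper's actual design may differ.
    """

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # Horizontal and vertical Sobel kernels, applied depthwise
        # (one pair per input channel) so they stay fixed edge detectors.
        sobel_x = torch.tensor([[-1., 0., 1.],
                                [-2., 0., 2.],
                                [-1., 0., 1.]])
        sobel_y = sobel_x.t()
        kernels = torch.stack([sobel_x, sobel_y])            # (2, 3, 3)
        kernels = kernels.repeat(in_ch, 1, 1).unsqueeze(1)   # (2*in_ch, 1, 3, 3)
        self.register_buffer("sobel", kernels)               # not trained
        self.in_ch = in_ch

        # Learnable convolutions with multiple kernel sizes (assumed 3x3, 5x5).
        self.conv3 = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.conv5 = nn.Conv2d(in_ch, out_ch, kernel_size=5, padding=2)
        # 1x1 convolution fusing edge branch + both learnable branches.
        self.fuse = nn.Conv2d(2 * in_ch + 2 * out_ch, out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Depthwise Sobel filtering: two edge maps per input channel.
        edges = nn.functional.conv2d(x, self.sobel, padding=1,
                                     groups=self.in_ch)
        return self.fuse(torch.cat([edges, self.conv3(x), self.conv5(x)], dim=1))
```

In this sketch the Sobel kernels are registered as a buffer, so gradient updates never alter them; only the multi-scale branches and the fusion layer learn, which matches the abstract's intent of guaranteeing explicit edge information in the first layer.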