[1] LI Tao, GAO Zhigang, GUAN Shengyuan, et al. Global attention mechanism with real-time semantic segmentation network[J]. CAAI Transactions on Intelligent Systems, 2023, 18(2): 282-292. [doi:10.11992/tis.202208027]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume:
18
Issue:
2023, No. 2
Page numbers:
282-292
Column:
Academic Papers - Intelligent Systems
Publication date:
2023-05-05
- Title:
-
Global attention mechanism with real-time semantic segmentation network
- Author(s):
-
LI Tao 1,2; GAO Zhigang 3; GUAN Shengyuan 4; XU Jiucheng 1,2; MA Yuanyuan 1
-
1. College of Computer and Information Engineering, He’nan Normal University, Xinxiang 453007, China;
2. Engineering Lab of He’nan Province for Intelligence Business & Internet of Things, Xinxiang 453007, China;
3. College of Software, He’nan Normal University, Xinxiang 453007, China;
4. National Security Academy, People’s Public Security University of China, Beijing 100038, China
-
- Keywords:
-
real-time semantic segmentation; global attention mechanism; multiscale feature fusion; hybrid dilated convolution; convolutional neural network; pyramid pooling; receptive field; feature extraction
- CLC:
-
TP391
- DOI:
-
10.11992/tis.202208027
- Abstract:
-
Lightweight network structures cannot extract sufficient effective semantic information from feature maps, and a poorly designed block for fusing semantic information with spatial detail information reduces segmentation accuracy. To address these problems, this paper proposes GaSeNet, a real-time semantic segmentation network with a global attention mechanism. First, a global attention mechanism is introduced into the semantic branch of the dual-branch structure. It guides the convolutional neural network along both the channel and spatial dimensions to focus on the semantic categories relevant to the segmentation task, so that more effective semantic information is extracted. Second, a hybrid dilated convolution block is designed in the spatial detail branch. It enlarges the receptive field while keeping the convolution kernel size unchanged, which captures additional global spatial detail information and compensates for the loss of key feature information. The feature fusion module is then redesigned, and a deep aggregation pyramid pooling module is introduced to fuse feature maps of different scales comprehensively, thereby improving the semantic segmentation performance of the network. Finally, the proposed method is tested on the CamVid and Vaihingen datasets. Compared with the latest semantic segmentation algorithm, GaSeNet improves segmentation accuracy by 4.29% and 16.06%, respectively. The experimental results verify the effectiveness of the method for real-time semantic segmentation.
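The abstract's claim that dilated convolution enlarges the receptive field without changing the kernel size can be checked with simple arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes stride-1 layers with 3x3 kernels, and the dilation rates (1, 2, 5) are a hypothetical choice, since the abstract does not state the rates used in GaSeNet. The function names are likewise made up for this example.

```python
from typing import Sequence


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by one dilated kernel along a single axis.

    A kernel with k taps and dilation d leaves (d - 1) gaps between
    neighbouring taps, so it spans k + (k - 1) * (d - 1) positions.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of stride-1 convolutions stacked in sequence.

    Each layer widens the receptive field by (effective kernel size - 1).
    """
    receptive_field = 1
    for dilation in dilations:
        receptive_field += effective_kernel_size(kernel_size, dilation) - 1
    return receptive_field


# Three ordinary 3x3 layers versus three 3x3 layers with assumed
# dilation rates (1, 2, 5); both use the same number of weights.
plain = stacked_receptive_field(3, [1, 1, 1])
hybrid = stacked_receptive_field(3, [1, 2, 5])
print(plain, hybrid)  # 7 17
```

Under these assumptions, the dilated stack covers a 17-pixel-wide window with the same parameter count as the plain stack's 7-pixel window, which is the trade-off the spatial detail branch is described as exploiting.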