[1] WANG Fengsui, CHEN Jingang, WANG Qisheng, et al. Multi-scale target detection algorithm based on adaptive context features[J]. CAAI Transactions on Intelligent Systems, 2022, 17(2): 276-285. [doi:10.11992/tis.202101029]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 17
Issue: No. 2, 2022
Pages: 276-285
Column: Academic Papers - Machine Learning
Publication date: 2022-03-05
- Title:
-
Multi-scale target detection algorithm based on adaptive context features
- Author(s):
-
WANG Fengsui1,2,3, CHEN Jingang1,2,3, WANG Qisheng1,2,3, LIU Furong1,2,3
-
1. School of Electrical Engineering, Anhui Polytechnic University, Wuhu 241000, China;
2. Anhui Key Laboratory of Detection Technology and Energy Saving Devices, Wuhu 241000, China;
3. Key Laboratory of Advanced Perception and Intelligent Control of High-end Equipment, Ministry of Education, Wuhu 241000, China
- Keywords:
-
machine vision; target detection; convolutional neural network; channel attention; parallel dilated convolution; multi-scale feature fusion; contextual features; deep learning
- CLC number:
-
TP391.4
- DOI:
-
10.11992/tis.202101029
- Abstract:
-
Recognizing targets at multiple scales is a persistent challenge in detection tasks. To address it, a multi-scale target detection algorithm based on adaptive context features is proposed. First, because targets of different scales require features with different receptive-field sizes, a multi-receptive-field feature extraction network is constructed; it uses multi-branch parallel dilated convolution to mine the contextual information in the labels from high-level semantic features. Second, because the semantic features of targets at different scales appear in feature maps of different resolutions, an adaptive feature fusion network based on an improved channel attention mechanism is proposed; by learning the correlation between feature maps of different resolutions, it fuses local location features into global semantic features. Finally, feature maps of different scales are used to identify objects of different scales. The algorithm was verified on the PASCAL VOC dataset: its detection accuracy reached 85.74%, approximately 8.7% higher than Faster R-CNN and about 2.06% higher than the baseline detection algorithm YOLOv3+.
Last Update:
1900-01-01