
[1] WANG Fengsui, CHEN Jingang, WANG Qisheng, et al. Multi-scale target detection algorithm based on adaptive context features[J]. CAAI Transactions on Intelligent Systems, 2022, 17(2): 276-285. [doi:10.11992/tis.202101029]


Multi-scale target detection algorithm based on adaptive context features


CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume: 17 Issue: 2022, No. 2 Pages: 276-285 Column: Academic Papers: Machine Learning Publication date: 2022-03-05

Title: Multi-scale target detection algorithm based on adaptive context features

Author(s): WANG Fengsui¹,²,³; CHEN Jingang¹,²,³; WANG Qisheng¹,²,³; LIU Furong¹,²,³
1. School of Electrical Engineering, Anhui Polytechnic University, Wuhu 241000, China;
2. Anhui Key Laboratory of Detection Technology and Energy Saving Devices, Wuhu 241000, China;
3. Key Laboratory of Advanced Perception and Intelligent Control of High-end Equipment, Ministry of Education, Wuhu 241000, China

Keywords: machine vision; target detection; convolutional neural network; channel attention; parallel dilated convolution; multi-scale feature fusion; contextual feature; deep learning

CLC: TP391.4

DOI: 10.11992/tis.202101029

Abstract: Recognizing targets at multiple scales is a challenge in any detection task. To address this multi-scale problem, a multi-scale target detection algorithm based on adaptive context features is proposed. Because targets of different scales need features with different receptive fields to be recognized, a multi-receptive-field feature extraction network is constructed: multi-branch parallel dilated convolution extracts the contextual information in the labels from high-level semantic features. Because the semantic features of targets at different scales appear in feature maps of different resolutions, an adaptive feature fusion network based on an improved channel attention mechanism is proposed. It learns the correlation between feature maps of different resolutions and fuses local location features into global semantic features. Feature maps of different scales are then used to identify objects of different scales. The proposed algorithm was verified on the Pascal VOC dataset, where its detection accuracy reached 85.74%, approximately 8.7% higher than Faster R-CNN and about 2.06% higher than the baseline detection algorithm, YOLOv3+.
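The two ideas named in the abstract, parallel dilated branches with different receptive fields and a weighted fusion of their outputs, can be illustrated with a minimal sketch. This is not the authors' network: it works on a 1D signal in plain Python, the function names (`effective_kernel_size`, `dilated_conv1d`, `parallel_branches`, `fuse_branches`) are hypothetical, the dilation rates 1, 2, 4 are assumed for illustration, and the fusion step is a parameter-free softmax over each branch's global average, standing in for the paper's improved channel attention, whose details the abstract does not give.

```python
from math import exp


def effective_kernel_size(kernel_size, dilation):
    """Span, in input samples, covered by one dilated kernel: k + (k - 1)(d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D dilated cross-correlation (odd kernel length)."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart around position i.
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def parallel_branches(signal, kernel, dilations):
    """Apply the same kernel at several dilation rates to the same input."""
    return [dilated_conv1d(signal, kernel, d) for d in dilations]


def softmax(values):
    peak = max(values)
    exps = [exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_branches(branches):
    """Assumed stand-in for attention-based fusion: weight each branch by the
    softmax of its global average, then sum the weighted branches."""
    pooled = [sum(branch) / len(branch) for branch in branches]
    weights = softmax(pooled)
    length = len(branches[0])
    fused = [sum(w * b[i] for w, b in zip(weights, branches)) for i in range(length)]
    return fused, weights


# A unit impulse makes each branch's receptive field directly visible.
signal = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
kernel = [1.0, 1.0, 1.0]
dilations = [1, 2, 4]  # assumed rates, not taken from the paper

branches = parallel_branches(signal, kernel, dilations)
fused, weights = fuse_branches(branches)

for d, branch in zip(dilations, branches):
    print(f"dilation={d} span={effective_kernel_size(len(kernel), d)} output={branch}")
print("weights:", weights)
print("fused:", fused)
```

With the impulse input, each branch responds at positions spaced by its dilation rate, so a larger rate covers a wider span with the same three weights. In a real detector the branches would be 2D convolutions over feature maps and the fusion weights would be learned rather than computed from a fixed pooling rule.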

