[1]程德强,马尚,寇旗旗,等.基于YOLOv4改进特征融合及全局感知的目标检测算法[J].智能系统学报,2024,19(2):325-334.[doi:10.11992/tis.202207018]
　CHENG Deqiang,MA Shang,KOU Qiqi,et al.Target detection algorithm for improving feature fusion and global perception based on YOLOv4[J].CAAI Transactions on Intelligent Systems,2024,19(2):325-334.[doi:10.11992/tis.202207018]

基于YOLOv4改进特征融合及全局感知的目标检测算法

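CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]; Volume: 19; Issue: No. 2, 2024; Pages: 325-334; Column: Academic Papers, Machine Perception and Pattern Recognition; Publication date: 2024-03-05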

Title:: Target detection algorithm for improving feature fusion and global perception based on YOLOv4

作者:: 程德强¹, 马尚¹, 寇旗旗², 张皓翔¹, 钱建生¹; 1. 中国矿业大学信息与控制工程学院, 江苏徐州 221116;
2. 中国矿业大学计算机科学与技术学院, 江苏徐州 221116

Author(s):: CHENG Deqiang¹, MA Shang¹, KOU Qiqi², ZHANG Haoxiang¹, QIAN Jiansheng¹; 1. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China;
2. School of Computer Science & Technology, China University of Mining and Technology, Xuzhou 221116, China

关键词:: YOLOv4; 目标检测; 特征融合; 跨尺度; 多尺度变化; 全局注意力; 平均池化; 上下文信息

Keywords:: YOLOv4; target detection; feature fusion; cross-scale; multiscale variation; global attention; average pooling; contextual information

CLC number:: TP391

DOI:: 10.11992/tis.202207018

Document code:: 2023-11-15

摘要:: YOLOv4算法在检测速度和精度上达到了很好的平衡，但仍存在着定位框不准确、检测率低的问题，尤其是在检测目标较小、尺度变化大的情况下。针对以上问题，提出一种新的基于YOLOv4改进的目标检测算法。该算法采用改进的特征融合模块（path aggregation network combined with bi-directional feature pyramid network，P-Bifpn）代替PANet（path aggregation network），增加跨尺度连接的同时在输出端引入权重，增强重要特征的表现力，解决由多尺度变化而引起的精度下降。然后，采用新的全局注意力机制（global association network，GANet），在减少平均池化与计算量的同时增强Sigmoid函数输出，加强模型对目标上下文关系的学习，减少噪声干扰和全局信息的损失。试验采用RSOD、NWPU VHR-10数据集，平均检测精度分别提升了约5%和3%；泛化试验采用VOC2007+2012公共数据集, 平均检测精度提升了约0.6%。试验结果表明改进的算法能够有效提高模型的检测能力。

Abstract:: The YOLOv4 algorithm strikes a good balance between detection speed and accuracy, but it still suffers from inaccurate bounding-box localization and a low detection rate, especially when targets are small or vary widely in scale. To address these problems, a new target detection algorithm based on an improved YOLOv4 is proposed. The algorithm replaces the path aggregation network (PANet) with an improved feature fusion module, the path aggregation network combined with bi-directional feature pyramid network (P-Bifpn). This module adds cross-scale connections and introduces weights at the output, which strengthens the expression of important features and counters the accuracy loss caused by multiscale variation. A new global attention mechanism, the global association network (GANet), is then adopted. It enhances the output of the Sigmoid function while reducing average pooling and computation, strengthens the model's learning of the contextual relationships of targets, and reduces noise interference and the loss of global information. On the RSOD and NWPU VHR-10 datasets, average detection accuracy improves by about 5% and 3%, respectively; in the generalization experiment on the public VOC2007+2012 dataset, it improves by about 0.6%. The experimental results show that the improved algorithm effectively enhances the detection capability of the model.
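
The abstract does not give the equations behind P-Bifpn or GANet, so the following is only a minimal sketch of the two generic building blocks it mentions: weighted fusion of feature maps and a pooling-plus-Sigmoid attention gate. The fusion function follows the fast normalized fusion commonly used in BiFPN-style networks, and the gate is a simplified squeeze-and-excitation-style layer with a single linear map. Both are assumptions chosen for illustration, not the authors' modules, and the names `fast_normalized_fusion`, `channel_gate`, and `fc_weights` are hypothetical.

```python
import math


def fast_normalized_fusion(features, raw_weights, eps=1e-4):
    """Fuse same-sized feature maps with learnable non-negative weights.

    features: list of feature maps, each a flat list of floats of equal length
    raw_weights: one unconstrained scalar per feature map (learned in training)
    """
    if len(features) != len(raw_weights):
        raise ValueError("need exactly one weight per feature map")
    # ReLU keeps every weight non-negative so the normalization is well defined.
    weights = [max(0.0, w) for w in raw_weights]
    total = sum(weights) + eps
    fused = [0.0] * len(features[0])
    for feat, w in zip(features, weights):
        for i, value in enumerate(feat):
            fused[i] += (w / total) * value
    return fused


def channel_gate(feature_map, fc_weights):
    """Simplified SE-style gate: global average pooling, linear map, Sigmoid.

    feature_map: list of channels, each a flat list of spatial values
    fc_weights: channels x channels matrix of hypothetical learned weights
    """
    # Global average pooling reduces each channel to one scalar.
    pooled = [sum(channel) / len(channel) for channel in feature_map]
    gates = []
    for row in fc_weights:
        z = sum(w * p for w, p in zip(row, pooled))
        gates.append(1.0 / (1.0 + math.exp(-z)))  # Sigmoid maps z into (0, 1)
    # Rescale every value in a channel by that channel's gate.
    return [[g * v for v in channel] for g, channel in zip(gates, feature_map)]


if __name__ == "__main__":
    shallow = [1.0, 2.0]   # toy feature map from a high-resolution level
    deep = [3.0, 4.0]      # toy feature map from a low-resolution level
    print(fast_normalized_fusion([shallow, deep], [1.0, 1.0]))
    print(channel_gate([shallow, deep], [[0.1, 0.0], [0.0, 0.1]]))
```

Normalizing by the sum of the weights rather than applying a softmax keeps each fusion weight between 0 and 1 at low cost, which is the usual motivation given for this form of weighted fusion.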

备注/Memo

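Received: 2022-07-12.
Funding: National Natural Science Foundation of China (52204177).
Author biographies:
CHENG Deqiang, professor, doctoral supervisor, Ph.D. Main research interests: computer vision and pattern recognition, and intelligent image detection. Has led 3 National Natural Science Foundation of China projects and more than 10 provincial- and ministerial-level science and technology projects, including a Jiangsu Province major achievement transformation project. Has published more than 70 academic papers as first (corresponding) author. E-mail: chengdq@cumt.edu.cn
MA Shang, master's student. Main research interests: image processing and target detection. E-mail: 710584238@qq.com
KOU Qiqi, lecturer. Main research interests: video and image processing, and pattern recognition. E-mail: 137156449@qq.com
Corresponding author: CHENG Deqiang. E-mail: chengdq@cumt.edu.cn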
