[1] CHENG Deqiang, MA Shang, KOU Qiqi, et al. Target detection algorithm for improving feature fusion and global perception based on YOLOv4[J]. CAAI Transactions on Intelligent Systems, 2024, 19(2): 325-334. [doi:10.11992/tis.202207018]
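CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 19
Issue: No. 2, 2024
Pages: 325-334
Section: Academic Papers: Machine Perception and Pattern Recognition
Publication date: 2024-03-05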
- Title:
-
Target detection algorithm for improving feature fusion and global perception based on YOLOv4
- Author(s):
-
CHENG Deqiang1, MA Shang1, KOU Qiqi2, ZHANG Haoxiang1, QIAN Jiansheng1
-
1. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China;
2. School of Computer Science & Technology, China University of Mining and Technology, Xuzhou 221116, China
-
- Keywords:
-
YOLOv4; target detection; feature fusion; cross-scale; multiscale variation; global attention; average pooling; contextual information
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.202207018
- Document code:
-
2023-11-15
- Abstract:
-
YOLOv4 strikes a good balance between detection speed and accuracy, but it still suffers from inaccurate bounding-box localization and a low detection rate, especially when targets are small or vary widely in scale. To address these problems, a new target detection algorithm based on an improved YOLOv4 is proposed. The algorithm replaces the path aggregation network (PANet) with an improved feature fusion module, P-Bifpn (path aggregation network combined with bi-directional feature pyramid network), which adds cross-scale connections and introduces weights at the output to strengthen the expression of important features and to counter the accuracy loss caused by multiscale variation. A new global attention mechanism, the global association network (GANet), is then adopted. It reduces average pooling and computation while enhancing the Sigmoid function output, which strengthens the model's learning of the contextual relationships of targets and reduces noise interference and the loss of global information. In experiments on the RSOD and NWPU VHR-10 datasets, average detection accuracy improves by about 5% and 3%, respectively; in a generalization experiment on the public VOC2007+2012 dataset, it improves by about 0.6%. The results show that the improved algorithm effectively enhances the detection ability of the model.
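The abstract names two mechanisms without giving formulas: weights applied when features from different scales are fused, and an attention step built on average pooling and a Sigmoid output. The sketch below is only an illustration of those two general ideas, not the paper's P-Bifpn or GANet. It assumes the BiFPN-style "fast normalized fusion" rule and a generic average-pool-then-Sigmoid channel gate; the function names, the `eps` default, and the flat-list data layout are all hypothetical choices made for the example.

```python
import math


def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, normalized weights.

    Assumed rule (BiFPN-style): out = sum(w_i * f_i) / (eps + sum(w_j)),
    with every raw weight clipped at zero first.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature")
    clipped = [max(0.0, w) for w in weights]  # ReLU keeps weights >= 0
    total = eps + sum(clipped)                # eps avoids division by zero
    length = len(features[0])
    fused = [0.0] * length
    for feat, w in zip(features, clipped):
        if len(feat) != length:
            raise ValueError("all features must have the same length")
        for i, value in enumerate(feat):
            fused[i] += (w / total) * value
    return fused


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(feature_map):
    """Rescale each channel by sigmoid(global average of that channel).

    feature_map is a list of channels, each a flat list of activations.
    This is a generic gate, not the GANet design from the paper.
    """
    gated = []
    for channel in feature_map:
        mean = sum(channel) / len(channel)  # global average pooling
        gate = sigmoid(mean)                # gate value lies in (0, 1)
        gated.append([gate * v for v in channel])
    return gated


if __name__ == "__main__":
    # Two hypothetical feature vectors from different pyramid levels.
    print(fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 3.0]))
    print(channel_gate([[2.0, -2.0], [1.0, 3.0]]))
```

In a real detector the weights would be learned parameters and the features would be multi-channel tensors resized to a common resolution before fusion; plain lists are used here only to keep the arithmetic visible.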
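Memo
Received: 2022-07-12.
Funding: National Natural Science Foundation of China (52204177).
Author biographies: CHENG Deqiang, professor, doctoral supervisor, Ph.D.; main research interests: computer vision and pattern recognition, and intelligent image detection. Has led 3 National Natural Science Foundation of China projects and more than 10 provincial and ministerial science and technology projects, including a Jiangsu Province major achievement transformation project, and has published more than 70 academic papers as first author (corresponding author). E-mail: chengdq@cumt.edu.cn; MA Shang, master's student; main research interests: image processing and target detection. E-mail: 710584238@qq.com; KOU Qiqi, lecturer; main research interests: video and image processing and pattern recognition. E-mail: 137156449@qq.com
Corresponding author: CHENG Deqiang. E-mail: chengdq@cumt.edu.cn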