[1] SHAO Yuxiao, LU Tao, WANG Zhenyu, et al. Human detection algorithm in infrared images combining multi-scale large kernel convolution[J]. CAAI Transactions on Intelligent Systems, 2025, 20(4): 787-799. [doi:10.11992/tis.202404027]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
20
Issue:
No. 4, 2025
Pages:
787-799
Section:
Academic Papers: Machine Learning
Publication date:
2025-08-05
- Title:
-
Human detection algorithm in infrared images combining multi-scale large kernel convolution
- Author(s):
-
SHAO Yuxiao (邵煜潇)¹, LU Tao (鲁涛)², WANG Zhenyu (王震宇)¹, PENG Yongjie (彭勇杰)¹, YAO Wei (姚巍)¹
-
1. School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China;
2. State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- Keywords:
-
infrared image; object detection; reconstruction attention; multi-scale feature; large kernel convolution; convolutional neural network; feature extraction; re-parameterization
- CLC number:
-
TP391.4
- DOI:
-
10.11992/tis.202404027
- Document code:
-
2025-2-24
- Abstract:
-
To address the low image resolution and inconspicuous human features that hamper human detection in infrared images of ruins environments, an infrared-image human detection network, RML-YOLO (re-parameterization multi-scale large kernel convolution), is designed on the YOLO framework. The network incorporates re-parameterization and multi-scale large kernel convolution. A spatial and channel reconstruction attention module concentrates attention on the regions that matter most for the detection task, and the Sobel operator strengthens edge features, which improves the detection of humans in different poses. The effectiveness of RML-YOLO is verified on a self-built dataset. With only 1.8×10⁶ learnable parameters, the model reaches an AP50 of 91.2% and an AP50-75 of 87.3%, which are 4.4% and 5.3% higher, respectively, than those of YOLOv8-n, a model with a similar number of parameters. The results show that RML-YOLO significantly improves the accuracy of human detection in ruins environments using infrared images.
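The abstract says that edge features are strengthened with the Sobel operator. As a rough illustration of what that operator computes, here is a minimal plain-Python sketch. It is not the RML-YOLO implementation: the function name `sobel_magnitude` and the toy image are assumptions made for this example, and how the paper integrates the operator into the network is not described here.

```python
# Illustrative sketch only: Sobel gradient magnitude on a tiny grayscale image.
# Not the RML-YOLO code; sobel_magnitude and the test image are hypothetical.

# Standard 3x3 Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude for interior pixels (no padding)."""
    height, width = len(image), len(image[0])
    out = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            gx = gy = 0
            # Slide the 3x3 kernels over the neighbourhood of (r, c).
            for i in range(3):
                for j in range(3):
                    pixel = image[r + i - 1][c + j - 1]
                    gx += SOBEL_X[i][j] * pixel
                    gy += SOBEL_Y[i][j] * pixel
            row.append((gx * gx + gy * gy) ** 0.5)
        out.append(row)
    return out


# A 4x4 image with a vertical step edge: dark left half, bright right half.
image = [[0, 0, 10, 10] for _ in range(4)]
edges = sobel_magnitude(image)
```

On this toy input every interior pixel borders the step, so the response is uniformly large, while a constant image yields zero everywhere. In low-resolution infrared frames, a response of this kind highlights body contours that raw intensities may leave faint.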
Last Update:
1900-01-01