[1]邵煜潇,鲁涛,王震宇,等.结合多尺度大核卷积的红外图像人体检测算法[J].智能系统学报,2025,20(4):787-799.[doi:10.11992/tis.202404027]
 SHAO Yuxiao,LU Tao,WANG Zhenyu,et al.Human detection algorithm in infrared images combining multi-scale large kernel convolution[J].CAAI Transactions on Intelligent Systems,2025,20(4):787-799.[doi:10.11992/tis.202404027]

结合多尺度大核卷积的红外图像人体检测算法 / Human detection algorithm in infrared images combining multi-scale large kernel convolution

参考文献/References:
[1] 高荣伟. 人类应对“气候紧急状态”, 需快速强力行动[J]. 世界文化, 2021(4): 4-7.
GAO Rongwei. To cope with the “climate emergency”, human beings need to act quickly and forcefully[J]. World culture, 2021(4): 4-7.
[2] 郑学召, 杨卓瑞, 郭军, 等. 灾后救援生命探测仪的现状和发展趋势[J]. 工矿自动化, 2023, 49(6): 104-111.
ZHENG Xuezhao, YANG Zhuorui, GUO Jun, et al. The current status and development trend of post-disaster rescue life detectors[J]. Journal of mine automation, 2023, 49(6): 104-111.
[3] 苏卫华, 吴航, 张西正, 等. 救援机器人研究起源、发展历程与问题[J]. 军事医学, 2014, 38(12): 981-985.
SU Weihua, WU Hang, ZHANG Xizheng, et al. Rescue robot research: origin, development and future[J]. Military medical sciences, 2014, 38(12): 981-985.
[4] 曲海成, 王宇萍, 谢梦婷, 等. 结合亮度感知与密集卷积的红外与可见光图像融合[J]. 智能系统学报, 2022, 17(3): 643-652.
QU Haicheng, WANG Yuping, XIE Mengting, et al. Infrared and visible image fusion combined with brightness perception and dense convolution[J]. CAAI transactions on intelligent systems, 2022, 17(3): 643-652.
[5] 张铭津, 周楠, 李云松. 平滑交互式压缩网络的红外小目标检测算法[J]. 西安电子科技大学学报, 2024, 51(4): 1-14.
ZHANG Mingjin, ZHOU Nan, LI Yunsong. Smooth interactive compression network for infrared small target detection[J]. Journal of Xidian University, 2024, 51(4): 1-14.
[6] 吴一非, 杨瑞, 吕其深, 等. 红外与可见光图像融合: 统计分析, 深度学习方法和未来展望[J]. 激光与光电子学进展, 2024, 61(14): 42-60.
WU Yifei, YANG Rui, LYU Qishen, et al. Infrared and visible image fusion: statistical analysis, deep learning methods and future prospects[J]. Laser & optoelectronics progress, 2024, 61(14): 42-60.
[7] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[8] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 779-788.
[9] REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. (2018-04-08)[2024-04-01]. https://arxiv.org/abs/1804.02767.
[10] BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. (2020-04-23)[2024-04-01]. https://arxiv.org/abs/2004.10934.
[11] JOCHER G. Ultralytics YOLOv5[EB/OL]. (2022-11-22)[2024-04-01]. https://github.com/ultralytics/yolov5.
[12] JOCHER G, CHAURASIA A, QIU Jing. Ultralytics YOLOv8[EB/OL]. (2023-01-22) [2024-04-01]. https://github.com/ultralytics/ultralytics.
[13] LI Chuyi, LI Lulu, JIANG Hongliang, et al. YOLOv6: a single-stage object detection framework for industrial applications[EB/OL]. (2022-09-07)[2024-04-01]. https://arxiv.org/abs/2209.02976.
[14] WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 7464-7475.
[15] HE Kaiming, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]//IEEE/CVF International Conference on Computer Vision. Venice: IEEE, 2017: 2980-2988.
[16] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition. Columbus: IEEE, 2014: 580-587.
[17] GIRSHICK R. Fast R-CNN[C]//IEEE/CVF International Conference on Computer Vision. Santiago: IEEE, 2015: 1440-1448.
[18] REN Shaoqing, HE Kaiming, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE transactions on pattern analysis and machine intelligence, 2017, 39(6): 1137-1149.
[19] HU Jie, SHEN Li, SUN Gang. Squeeze-and-excitation networks[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 7132-7141.
[20] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]//European Conference on Computer Vision. Munich: Springer, 2018: 3-19.
[21] LI Xiang, WANG Wenhai, HU Xiaolin, et al. Selective kernel networks[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 510-519.
[22] QIN Zequn, ZHANG Pengyi, WU Fei, et al. FcaNet: frequency channel attention networks[C]//IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 763-772.
[23] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[C]//European Conference on Computer Vision. Zurich: Springer, 2014: 346-361.
[24] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 936-944.
[25] LIU Shu, QI Lu, QIN Haifang, et al. Path aggregation network for instance segmentation[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 8759-8768.
[26] CHEN Yuming, YUAN Xinbin, WANG Jiabao, et al. YOLO-MS: rethinking multi-scale representation learning for real-time object detection[J]. IEEE transactions on pattern analysis and machine intelligence, 2025, 47(6): 4240-4252.
[27] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]//International Conference on Learning Representations. San Diego: ICLR, 2015.
[28] LIU Ze, LIN Yutong, CAO Yue, et al. Swin transformer: hierarchical vision transformer using shifted windows[C]//IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 9992-10002.
[29] LIU Zhuang, MAO Hanzi, WU Chaoyuan, et al. A ConvNet for the 2020s[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 11966-11976.
[30] DING Xiaohan, ZHANG Xiangyu, MA Ningning, et al. RepVGG: making VGG-style ConvNets great again[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 13733-13742.
[31] DING Xiaohan, ZHANG Xiangyu, HAN Jungong, et al. Scaling up your kernels to 31×31: revisiting large kernel design in CNNs[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 11953-11965.
[32] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 770-778.
[33] TAN Mingxing, PANG Ruoming, LE Q V. EfficientDet: scalable and efficient object detection[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 10781-10790.
[34] GAO Shanghua, CHENG Mingming, ZHAO Kai, et al. Res2Net: a new multi-scale backbone architecture[J]. IEEE transactions on pattern analysis and machine intelligence, 2021, 43(2): 652-662.
[35] SANDLER M, HOWARD A, ZHU Menglong, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 4510-4520.
[36] GE Zheng, LIU Songtao, WANG Feng, et al. YOLOX: exceeding YOLO series in 2021[EB/OL]. (2021-08-06)[2024-04-01]. https://arxiv.org/abs/2107.08430.
[37] 菲力尔. FLIR ONE Pro红外热像仪[EB/OL]. (2018-01-01)[2024-04-01]. https://www.flir.cn/products/flir-one-pro/?vertical=condition%20monitoring&segment=solutions.
TELEDYNE FLIR. FLIR ONE Pro thermal imaging camera[EB/OL]. (2018-01-01)[2024-04-01]. https://www.flir.cn/products/flir-one-pro/?vertical=condition%20monitoring&segment=solutions.
[38] LIU Wei, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector[C]//European Conference on Computer Vision. Amsterdam: Springer, 2016: 21-37.
[39] LYU Chengqi, ZHANG Wenwei, HUANG Haian, et al. RTMDet: an empirical study of designing real-time object detectors[EB/OL]. (2022-12-16)[2024-04-01]. https://arxiv.org/abs/2212.07784.
[40] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization[C]//IEEE/CVF International Conference on Computer Vision. Venice: IEEE, 2017: 618-626.
[41] JIA Xinyu, ZHU Chuang, LI Minzhen, et al. LLVIP: a visible-infrared paired dataset for low-light vision[C]//IEEE/CVF International Conference on Computer Vision Workshops. Montreal: IEEE, 2021: 3489-3497.
[42] LUO Wenjie, LI Yujia, URTASUN R, et al. Understanding the effective receptive field in deep convolutional neural networks[C]//Advances in Neural Information Processing Systems. Barcelona: Curran Associates, 2016: 29.
相似文献/Similar References:
[1]胡光龙,秦世引.动态成像条件下基于SURF和Mean shift的运动目标高精度检测[J].智能系统学报,2012,7(1):61.
 HU Guanglong,QIN Shiyin.High precision detection of a mobile object under dynamic imaging based on SURF and Mean shift[J].CAAI Transactions on Intelligent Systems,2012,7(1):61.
[2]韩峥,刘华平,黄文炳,等.基于Kinect的机械臂目标抓取[J].智能系统学报,2013,8(2):149.[doi:10.3969/j.issn.1673-4785.201212038]
 HAN Zheng,LIU Huaping,HUANG Wenbing,et al.Kinect-based object grasping by manipulator[J].CAAI Transactions on Intelligent Systems,2013,8(2):149.[doi:10.3969/j.issn.1673-4785.201212038]
[3]韩延彬,郭晓鹏,魏延文,等.RGB和HSI颜色空间的一种改进的阴影消除算法[J].智能系统学报,2015,10(5):769.[doi:10.11992/tis.201410010]
 HAN Yanbin,GUO Xiaopeng,WEI Yanwen,et al.An improved shadow removal algorithm based on RGB and HSI color spaces[J].CAAI Transactions on Intelligent Systems,2015,10(5):769.[doi:10.11992/tis.201410010]
[4]曾宪华,易荣辉,何姗姗.流形排序的交互式图像分割[J].智能系统学报,2016,11(1):117.[doi:10.11992/tis.201505037]
 ZENG Xianhua,YI Ronghui,HE Shanshan.Interactive image segmentation based on manifold ranking[J].CAAI Transactions on Intelligent Systems,2016,11(1):117.[doi:10.11992/tis.201505037]
[5]葛园园,许有疆,赵帅,等.自动驾驶场景下小且密集的交通标志检测[J].智能系统学报,2018,13(3):366.[doi:10.11992/tis.201706040]
 GE Yuanyuan,XU Youjiang,ZHAO Shuai,et al.Detection of small and dense traffic signs in self-driving scenarios[J].CAAI Transactions on Intelligent Systems,2018,13(3):366.[doi:10.11992/tis.201706040]
[6]莫宏伟,汪海波.基于Faster R-CNN的人体行为检测研究[J].智能系统学报,2018,13(6):967.[doi:10.11992/tis.201801025]
 MO Hongwei,WANG Haibo.Research on human behavior detection based on Faster R-CNN[J].CAAI Transactions on Intelligent Systems,2018,13(6):967.[doi:10.11992/tis.201801025]
[7]宁欣,李卫军,田伟娟,等.一种自适应模板更新的判别式KCF跟踪方法[J].智能系统学报,2019,14(1):121.[doi:10.11992/tis.201806038]
 NING Xin,LI Weijun,TIAN Weijuan,et al.Adaptive template update of discriminant KCF for visual tracking[J].CAAI Transactions on Intelligent Systems,2019,14(1):121.[doi:10.11992/tis.201806038]
[8]伍鹏瑛,张建明,彭建,等.多层卷积特征的真实场景下行人检测研究[J].智能系统学报,2019,14(2):306.[doi:10.11992/tis.201710019]
 WU Pengying,ZHANG Jianming,PENG Jian,et al.Research on pedestrian detection based on multi-layer convolution feature in real scene[J].CAAI Transactions on Intelligent Systems,2019,14(2):306.[doi:10.11992/tis.201710019]
[9]刘召,张黎明,耿美晓,等.基于改进的Faster R-CNN高压线缆目标检测方法[J].智能系统学报,2019,14(4):627.[doi:10.11992/tis.201905026]
 LIU Zhao,ZHANG Liming,GENG Meixiao,et al.Object detection of high-voltage cable based on improved Faster R-CNN[J].CAAI Transactions on Intelligent Systems,2019,14(4):627.[doi:10.11992/tis.201905026]
[10]单义,杨金福,武随烁,等.基于跳跃连接金字塔模型的小目标检测[J].智能系统学报,2019,14(6):1144.[doi:10.11992/tis.201905041]
 SHAN Yi,YANG Jinfu,WU Suishuo,et al.Skip feature pyramid network with a global receptive field for small object detection[J].CAAI Transactions on Intelligent Systems,2019,14(6):1144.[doi:10.11992/tis.201905041]

备注/Memo

Received: 2024-04-22.
About the authors: SHAO Yuxiao, master's student. His main research interests are computer vision and pattern recognition. E-mail: yx_shao@ncepu.edu.cn. LU Tao, associate researcher. His main research interests are intelligent robot control, human-robot interaction, manipulation skill learning, and imitation learning. He has published more than 50 academic papers and holds 20 authorized national invention patents. E-mail: tao.lu@ia.ac.cn. WANG Zhenyu, professor and doctoral supervisor. His main research interests are pattern recognition and computer vision. He has led five research projects, including projects funded by the National Natural Science Foundation of China, and received the Wu Wenjun Artificial Intelligence Science and Technology Award in 2019. E-mail: zywang@ncepu.edu.cn.
Corresponding author: WANG Zhenyu. E-mail: zywang@ncepu.edu.cn

Copyright © Editorial Office of CAAI Transactions on Intelligent Systems (《智能系统学报》编辑部)
Address: Building 145-1, Nantong Street, Nangang District, Harbin 150001, Heilongjiang Province, China. Tel: 0451-82534001, 82518134. E-mail: tis@vip.sina.com