
[1] LIANG Liming, FENG Yao, LONG Pengwei, et al. Remote sensing image object detection based on MobileViT and multiscale feature aggregation[J]. CAAI Transactions on Intelligent Systems, 2024, 19(5): 1168-1177. [doi:10.11992/tis.202310022]


Remote sensing image object detection based on MobileViT and multiscale feature aggregation


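CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume: 19 Issue: No. 5, 2024 Pages: 1168-1177 Column: Academic Papers - Machine Perception and Pattern Recognition Publication date: 2024-09-05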

Title:: Remote sensing image object detection based on MobileViT and multiscale feature aggregation

Author(s):: LIANG Liming, FENG Yao, LONG Pengwei, LI Renjie; School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000, Jiangxi, China

Keywords:: deep learning; remote sensing image; object detection; YOLOv7-tiny; MobileViT module; multi-scale feature fusion; contextual information; Wise-IoU

CLC number:: TP391

DOI:: 10.11992/tis.202310022

Document code:: 2024-08-28

Abstract:: To address complex background interference, the difficulty of extracting tiny objects, and multi-scale differences among objects in remote sensing image object detection, this paper proposes FWM-YOLOv7t, a detection algorithm based on MobileViT and multi-scale feature aggregation. First, a multi-scale feature aggregation module is designed to establish contextual dependencies among remote sensing objects, improving detection accuracy for multi-scale and small objects. Next, the MobileViT module combines the strengths of convolutional neural networks and vision transformers to encode local and global information effectively and suppress non-object noise. Finally, the Wise-IoU loss function is introduced to focus on ordinary-quality anchor boxes and improve detection performance. Experiments on the public RSOD and NWPU VHR-10 datasets show that FWM-YOLOv7t significantly improves the average accuracy of remote sensing image object detection. Compared with other object detection algorithms, FWM-YOLOv7t is more effective at detecting objects in complex backgrounds, small objects, and multi-scale objects.
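The abstract does not state which Wise-IoU variant or hyperparameters FWM-YOLOv7t uses, so the following is only a minimal sketch of the general idea, following the formulation commonly attributed to the Wise-IoU preprint: a distance-based factor scales the IoU loss, and a non-monotonic gain gives the largest weight to boxes of ordinary quality. The function names, the `(x1, y1, x2, y2)` box layout, and the values `alpha=1.9` and `delta=3.0` are assumptions made for illustration, not details taken from the paper.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def wiou_v1(pred, target):
    """Distance-weighted IoU loss (assumed Wise-IoU v1 form).

    Assumes both boxes have positive width and height.
    """
    l_iou = 1.0 - iou(pred, target)
    # Squared distance between the two box centres.
    dx = (pred[0] + pred[2]) / 2 - (target[0] + target[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (target[1] + target[3]) / 2
    # Width and height of the smallest box enclosing both boxes.
    # In a training framework this term would be excluded from gradients.
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])
    r = math.exp((dx * dx + dy * dy) / (wg * wg + hg * hg))
    return r * l_iou


def focusing_gain(l_iou, mean_l_iou, alpha=1.9, delta=3.0):
    """Non-monotonic gain (assumed Wise-IoU v3 form).

    beta is the outlier degree: the box's IoU loss relative to a
    running mean. alpha and delta are illustrative values only.
    """
    beta = l_iou / mean_l_iou
    return beta / (delta * alpha ** (beta - delta))


if __name__ == "__main__":
    pred, target = (0, 0, 2, 2), (1, 1, 3, 3)
    print("IoU:", iou(pred, target))
    print("WIoU v1 loss:", wiou_v1(pred, target))
    # The gain peaks for moderate outlier degrees, not at either extreme.
    for beta in (0.1, 1.5, 6.0):
        print("beta", beta, "gain", focusing_gain(beta, 1.0))
```

Under these assumptions, both very well-fitted boxes (small outlier degree) and very poor ones (large outlier degree) receive a smaller gain than boxes in between, which is one way to read the abstract's phrase about focusing on ordinary-quality anchor boxes.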



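Memo

Received: 2023-10-17.
Funding: National Natural Science Foundation of China (51365017, 61463018); General Program of the Natural Science Foundation of Jiangxi Province (20192BAB205084); Key Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ170491).
About the authors: LIANG Liming, professor; main research interests are machine learning, medical imaging, and system modeling. Holds 6 granted patents and has published more than 100 academic papers and 1 monograph. E-mail: lianglm67@163.com; FENG Yao, master's student; main research interests are deep learning and object detection. E-mail: fybrave@126.com; LONG Pengwei, master's student; main research interests are machine learning, pattern recognition, and image processing. E-mail: 2637018663@qq.com.
Corresponding author: LIANG Liming. E-mail: lianglm67@163.com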

Last Update: 2024-09-05
