[1]黄志鸿,杜瑞,张辉.面向复杂电力环境场景理解的可见光和红外图像特征级融合方法[J].智能系统学报,2025,20(3):631-640.[doi:10.11992/tis.202404014]
 HUANG Zhihong,DU Rui,ZHANG Hui.Feature-level fusion method of visible and infrared images for scene understanding in complex power environments[J].CAAI Transactions on Intelligent Systems,2025,20(3):631-640.[doi:10.11992/tis.202404014]

面向复杂电力环境场景理解的可见光和红外图像特征级融合方法

参考文献/References:
[1] 傅博, 姜勇, 王洪光, 等. 输电线路巡检图像智能诊断系统[J]. 智能系统学报, 2016, 11(1): 70-77.
FU Bo, JIANG Yong, WANG Hongguang, et al. Intelligent diagnosis system for patrol check images of power transmission lines[J]. CAAI transactions on intelligent systems, 2016, 11(1): 70-77.
[2] 张铭泉, 邢福德, 刘冬. 基于改进Faster R-CNN的变电站设备外部缺陷检测[J]. 智能系统学报, 2024, 19(2): 290-298.
ZHANG Mingquan, XING Fude, LIU Dong. External defect detection of transformer substation equipment based on improved Faster R-CNN[J]. CAAI transactions on intelligent systems, 2024, 19(2): 290-298.
[3] 冯晗, 姜勇. 使用改进Yolov5的变电站绝缘子串检测方法[J]. 智能系统学报, 2023, 18(2): 325-332.
FENG Han, JIANG Yong. A substation insulator string detection method based on an improved Yolov5[J]. CAAI transactions on intelligent systems, 2023, 18(2): 325-332.
[4] 黄志鸿, 颜星雨, 陶岩, 等. 基于多模态图像信息的配电网部件定位方法[J]. 湖南电力, 2024, 44(6): 83-89.
HUANG Zhihong, YAN Xingyu, TAO Yan, et al. Distribution network component positioning methods based on multi-modal image information[J]. Hunan electric power, 2024, 44(6): 83-89.
[5] 张辉, 杜瑞, 钟杭, 等. 电力设施多模态精细化机器人巡检关键技术及应用[J]. 自动化学报, 2025, 51(1): 20-42.
ZHANG Hui, DU Rui, ZHONG Hang, et al. The key technology and application of multi-modal fine robot inspection for power facilities[J]. Acta automatica sinica, 2025, 51(1): 20-42.
[6] 陶岩, 张辉, 黄志鸿, 等. 面向配电网典型部件的热故障精准判别方法[J]. 智能系统学报, 2025, 20(2): 506-515.
TAO Yan, ZHANG Hui, HUANG Zhihong, et al. Accurate identification of thermal faults for typical components of distribution networks[J]. CAAI transactions on intelligent systems, 2025, 20(2): 506-515.
[7] CHOI H, YUN J P, KIM B J, et al. Attention-based multimodal image feature fusion module for transmission line detection[J]. IEEE transactions on industrial informatics, 2022, 18(11): 7686-7695.
[8] XU Chang, LI Qingwu, JIANG Xiongbiao, et al. Dual-space graph-based interaction network for RGB-thermal semantic segmentation in electric power scene[J]. IEEE transactions on circuits and systems for video technology, 2023, 33(4): 1577-1592.
[9] ZHOU Wujie, LIU Jinfu, LEI Jingsheng, et al. GMNet: graded-feature multilabel-learning network for RGB-thermal urban scene semantic segmentation[J]. IEEE transactions on image processing, 2021, 30: 7790-7802.
[10] ZHOU Wujie, DONG Shaohua, XU Caie, et al. Edge-aware guidance fusion network for RGB-thermal scene parsing[EB/OL]. (2021-12-09)[2024-01-01]. https://arxiv.org/abs/2112.05144.
[11] LIU Jinfu, ZHOU Wujie, CUI Yueli, et al. GCNet: Grid-like context-aware network for RGB-thermal semantic segmentation[J]. Neurocomputing, 2022, 506: 60-67.
[12] LIN Baihong, LIN Zengrong, GUO Yulan, et al. Variational probabilistic fusion network for RGB-T semantic segmentation[EB/OL]. (2023-06-17)[2024-01-01]. https://arxiv.org/abs/2307.08536.
[13] LI Ping, CHEN Junjie, LIN Binbin, et al. Residual spatial fusion network for RGB-thermal semantic segmentation[EB/OL]. (2023-06-17)[2024-01-01]. https://arxiv.org/abs/2306.10364.
[14] ZHANG Jiyou, ZHANG Rongfen, LIU Yuhong, et al. RGB-T semantic segmentation based on cross-operational fusion attention in autonomous driving scenario[J]. Evolving systems, 2024, 15: 1429-1440.
[15] LIN Zengrong, LIN Baihong, GUO Yulan. Label-guided real-time fusion network for RGB-T semantic segmentation[C]//Proceedings of the British Machine Vision Conference. Aberdeen: BMVC, 2023: 767-770.
[16] SUN Yuxiang, ZUO Weixun, YUN Peng, et al. FuseSeg: semantic segmentation of urban scenes based on RGB and thermal data fusion[J]. IEEE transactions on automation science and engineering, 2021, 18(3): 1000-1011.
[17] ZHOU Wujie, LIN Xinyang, LEI Jingsheng, et al. MFFENet: multiscale feature fusion and enhancement network for RGB-thermal urban road scene parsing[J]. IEEE transactions on multimedia, 2021, 24: 2526-2538.
[18] DENG Fuqin, FENG Hua, LIANG Mingjian, et al. FEANet: feature-enhanced attention network for RGB-thermal real-time semantic segmentation[C]//2021 IEEE/RSJ International Conference on Intelligent Robots and Systems. Prague: IEEE, 2021: 4467-4473.
[19] FAN Siqi, WANG Zhe, WANG Yan, et al. SpiderMesh: spatial-aware demand-guided recursive meshing for RGB-T semantic segmentation[EB/OL]. (2023-09-27)[2024-01-01]. https://arxiv.org/abs/2303.08692v2.
[20] LIN Baihong, LIN Zengrong, GUO Yulan, et al. Asymmetric multimodal guidance fusion network for real-time visible–thermal semantic segmentation[J]. Robotics and computer-integrated manufacturing, 2024, 86: 103822.
[21] ZHANG Jiaming, LIU Huayao, YANG Kailun, et al. CMX: cross-modal fusion for RGB-X semantic segmentation with transformers[J]. IEEE transactions on intelligent transportation systems, 2023, 24(12): 14679-14694.
[22] LI Gongyang, WANG Yike, LIU Zhi, et al. RGB-T semantic segmentation with location, activation, and sharpening[J]. IEEE transactions on circuits and systems for video technology, 2023, 33(3): 1223-1235.
[23] SHIN U, LEE K, KWEON I S, et al. Complementary random masking for RGB-thermal semantic segmentation[C]//2024 IEEE International Conference on Robotics and Automation. Yokohama: IEEE, 2024: 11110-11117.
[24] WANG Yuxin, LI Gongyang, LIU Zhi. SGFNet: semantic-guided fusion network for RGB-thermal semantic segmentation[J]. IEEE transactions on circuits and systems for video technology, 2023, 33(12): 7737-7748.
[25] LI Gongyang, WANG Yike, LIU Zhi, et al. RGB-T semantic segmentation with location, activation, and sharpening[J]. IEEE transactions on circuits and systems for video technology, 2023, 33(3): 1223-1235.
[26] ZHOU Zikun, WU Shukun, ZHU Guoqing, et al. Channel and spatial relation-propagation network for RGB-thermal semantic segmentation[EB/OL]. (2023-08-24)[2024-01-01]. https://arxiv.org/abs/2308.12534.
[27] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: Transformers for image recognition at scale[EB/OL]. (2021-06-03)[2024-01-01]. https://arxiv.org/abs/2010.11929.
[28] MAO Anqi, MOHRI Mehryar, ZHONG Yutao. Cross-entropy loss functions: theoretical analysis and applications[EB/OL]. (2023-06-15)[2024-01-01]. https://arxiv.org/pdf/2304.07288v1.
[29] ZHANG Zhi, SABUNCU M R. Generalized cross entropy loss for training deep neural networks with noisy labels[J]. Advances in neural information processing systems, 2018, 31: 8792-8802.
[30] SUDRE C H, LI Wenqi, VERCAUTEREN T, et al. Generalised Dice overlap as a deep learning loss function for highly unbalanced segmentations[EB/OL]. (2017-07-14)[2024-01-01]. https://arxiv.org/abs/1707.03237.
[31] LI Xiaoya, SUN Xiaofei, MENG Yuxian, et al. Dice loss for data-imbalanced NLP tasks[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Online: Association for Computational Linguistics, 2020: 465-476.

备注/Memo

Received: 2024-04-16.
Foundation: Science and Technology Project of State Grid Hunan Electric Power Co., Ltd. (5216A522001Y).
About the authors: HUANG Zhihong, senior engineer and Ph.D. candidate; his main research interest is artificial intelligence for electric power. E-mail: zhihong_huang111@163.com. DU Rui, Ph.D. candidate; his main research interests are artificial intelligence for electric power and multimodal perception. E-mail: durui@hnu.edu.cn. ZHANG Hui, professor, doctoral supervisor, and Ph.D.; his main research interests are robotic visual inspection, deep learning, image recognition, intelligent robot control, and embedded system applications. In recent years, he has led more than 20 projects, including a major project of the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" program, a key project of the National Natural Science Foundation of China Major Research Plan on Co-robots, and subprojects of the National Key R&D Program and the National Science and Technology Support Program. His technical achievements received the Second Prize of the National Technology Invention Award in 2018, and as a principal contributor he has won eight first prizes of provincial and ministerial science and technology awards. He has published more than 50 academic papers and holds 38 national invention patents and 5 software copyrights. E-mail: zhanghuihby@126.com.
Corresponding author: ZHANG Hui. E-mail: zhanghuihby@126.com

Copyright © Editorial Office of CAAI Transactions on Intelligent Systems
Address: Building 145-1, Nantong Street, Nangang District, Harbin 150001, Heilongjiang Province. Tel: 0451-82534001, 82518134. E-mail: tis@vip.sina.com