[1]刘诗怡,刘金平,黄丽娟,等.基于多尺度协调卷积与自适应加权的红外与可见光图像融合[J].智能系统学报,2026,21(1):95-108.[doi:10.11992/tis.202504002]
LIU Shiyi,LIU Jinping,HUANG Lijuan,et al.Infrared and visible image fusion based on multi-scale coordinated convolution and adaptive weighting[J].CAAI Transactions on Intelligent Systems,2026,21(1):95-108.[doi:10.11992/tis.202504002]
CAAI Transactions on Intelligent Systems (《智能系统学报》) [ISSN 1673-4785 / CN 23-1538/TP]
Volume: 21
Issue: 2026, No. 1
Pages: 95-108
Section: Academic Papers: Machine Perception and Pattern Recognition
Publication date: 2026-03-05
- Title:
-
Infrared and visible image fusion based on multi-scale coordinated convolution and adaptive weighting
- Author(s):
-
LIU Shiyi1, LIU Jinping1, HUANG Lijuan2, JIANG Jiahao1, SONG Dianyi3, YANG Guangyi4
-
1. College of Information Science and Engineering, Hunan Normal University, Changsha 410081, China;
2. Hunan Intelligent Rehabilitation Robot and Auxiliary Equipment Engineering Technology Research Center, Changsha 410004, China;
3. Basic Education College, National University of Defense Technology, Changsha 410072, China;
4. Hunan Institute of Metrology and Testing, Changsha 410081, China
-
- Keywords:
-
image fusion; infrared image; visible image; multiscale coordinate convolution; convolutional weighted permute multilayer perceptron; coordinate attention; adaptive weighting
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.202504002
- Abstract:
-
To address the limitations of convolutional neural network-based image fusion models, namely restricted global information perception, weak high-frequency detail preservation, and manually configured loss-function weights, this article proposes a convolution and multilayer perceptron-integrated multiscale coordinate network (CM-MCNet) for high-quality infrared and visible image fusion. In the encoder of CM-MCNet, a convolutional weighted permute multilayer perceptron module is introduced that enhances spatial understanding by simulating feature permutation and integrates an adaptive feature reweighting mechanism to effectively capture global information. Meanwhile, a multiscale coordinate convolution (MsCConv) module is designed that leverages central difference convolution to strengthen the retention and expression of high-frequency details; by incorporating multiscale parallel sub-networks, MsCConv ensures comprehensive preservation of multi-level features. Moreover, an embedded coordinate attention mechanism jointly modulates the channel and spatial dimensions, enhancing complementary information while suppressing redundancy. Furthermore, a data-driven adaptive loss weighting strategy is proposed that dynamically adjusts the contribution of each supervision signal based on image feature statistics, reducing the complexity of hyperparameter tuning while ensuring that the loss function more accurately reflects the characteristics of the source images. Experimental results on the RoadScene, TNO, and M3FD public datasets demonstrate that CM-MCNet generates fused images with sharper edge preservation and more natural texture transitions, and that it outperforms existing state-of-the-art fusion methods on objective metrics including information entropy, standard deviation, spatial frequency, visual information fidelity, and average gradient. This work provides a novel perspective on infrared and visible image fusion and lays a solid foundation for further advances in the field.
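Two mechanisms named in the abstract can be illustrated concretely. The NumPy sketch below is a minimal illustration only, not the authors' implementation: the function names, the valid-padding single-channel convolution, and the choice of information entropy as the driving statistic are assumptions for demonstration. Central difference convolution rewrites a vanilla convolution as y = conv(x) − θ·x_center·Σw, so flat regions are suppressed and gradient-like high-frequency detail is emphasized; a data-driven loss weight can then be obtained by normalizing a per-image statistic so that the statistically richer source contributes more to its loss term.

```python
import numpy as np

def central_difference_conv2d(x, w, theta=0.7):
    """Single-channel 2-D central difference convolution (valid padding).

    y(p0) = sum_n w(pn) * x(p0 + pn)  -  theta * x(p0) * sum_n w(pn)
    With theta = 0 this reduces to a vanilla convolution; larger theta
    subtracts the centre pixel and amplifies high-frequency detail.
    """
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    w_sum = w.sum()
    for i in range(oh):
        for j in range(ow):
            patch = x[i:i + kh, j:j + kw]          # vanilla receptive field
            center = x[i + kh // 2, j + kw // 2]   # centre pixel of the patch
            out[i, j] = (patch * w).sum() - theta * center * w_sum
    return out

def entropy(img, bins=256):
    """Shannon information entropy (bits) of an image with values in [0, 1)."""
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def adaptive_loss_weights(ir, vis):
    """Data-driven weights for two intensity-loss terms.

    Each source's weight is its entropy normalised over both sources,
    so the more informative image drives the loss harder; the pair
    always sums to 1, avoiding manual rebalancing.
    """
    e_ir, e_vis = entropy(ir), entropy(vis)
    return e_ir / (e_ir + e_vis), e_vis / (e_ir + e_vis)
```

On a constant image of value c the CDC response is (1 − θ)·c·Σw, so with θ = 1 flat regions vanish entirely while edges keep a strong response, which matches the high-frequency-preserving motivation described above.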
Memo
Received: 2025-04-01.
Funding: National Natural Science Foundation of China (62371187); Natural Science Foundation of Hunan Province (2024JJ8309).
About the authors: LIU Shiyi, master's student. Her research interests include machine learning, computer vision, and image processing. E-mail: liushiyi@hunnu.edu.cn. LIU Jinping, professor and doctoral supervisor. His research interests include machine learning, pattern recognition, industrial process monitoring, fault diagnosis, and computer vision. He has led or participated in more than 10 national and provincial/ministerial research projects, holds 20 granted national invention patents, and has published more than 80 academic papers. E-mail: ljp@hunnu.edu.cn. HUANG Lijuan, lecturer. Her research interests include intelligent control, machine learning, and industrial process control. She has led or participated in 5 provincial/ministerial and municipal research projects and holds 6 granted national invention patents. E-mail: huanglijuan@csmzxy.edu.cn.
Corresponding author: LIU Jinping. E-mail: ljp@hunnu.edu.cn
Last update:
2026-01-05