[1] ZHAO Wenqing, LI Yiye. Remote sensing image object detection based on dynamic hypergraphs and multi-scale feature fusion[J]. CAAI Transactions on Intelligent Systems, 2026, 21(2): 399-409. [doi:10.11992/tis.202508009]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
21
Issue:
No. 2, 2026
Pages:
399-409
Section:
Academic Papers: Machine Learning
Publication date:
2026-03-05
- Title:
-
Remote sensing image object detection based on dynamic hypergraphs and multi-scale feature fusion
- Author(s):
-
ZHAO Wenqing (赵文清)1,2, LI Yiye (李溢晔)1
-
1. School of Control and Computer Engineering, North China Electric Power University, Baoding 071003, China;
2. Hebei Key Laboratory of Knowledge Computing for Energy & Power, Baoding 071003, China
-
- Keywords:
-
remote sensing image; object detection; multi-scale; feature fusion; dynamic hypergraph; semantic features; coordinate attention; dilated convolution
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.202508009
- Abstract:
-
Objects in remote sensing images vary widely in scale and sit against complex backgrounds, while existing object detection models have limited multi-scale perception and weak global semantic feature modeling. To address these problems, this paper proposes a remote sensing image object detection model based on dynamic hypergraphs and multi-scale feature fusion. First, a multi-scale dilated convolution feature fusion module is constructed, and a corresponding feature extraction network is designed to fully extract multi-scale features. Second, a dynamic gated hypergraph module is constructed and used to build a global semantic feature modeling network, which strengthens the perception of target feature regions and weakens interference from complex backgrounds. Finally, a multi-channel coordinate attention module is proposed, which combines the coordinate attention mechanism with multi-scale channel interaction to enhance feature representation. Ablation experiments on the DIOR and RSOD datasets show that the proposed model improves mean average precision over the YOLO11 model by 2.5 and 2.3 percentage points, respectively. Comparative experiments against other methods further show that the proposed model achieves better detection performance.
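As a rough illustration of the multi-scale dilated convolution idea named in the abstract, the sketch below applies one kernel at several dilation rates and fuses the branches. It is not the authors' module: the 1-D setting, the function names (`dilated_conv1d`, `receptive_field`, `multi_scale_fuse`), the dilation rates (1, 2, 3), and fusion by simple averaging are all assumptions chosen to keep the example small.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1-D dilated cross-correlation with zero padding.

    Assumes an odd-length kernel so that it has a well-defined centre tap.
    """
    k = len(kernel)
    centre = k // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` samples apart around position i.
            idx = i + (j - centre) * dilation
            if 0 <= idx < len(signal):  # out-of-range taps read as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilation):
    """Span (in samples) covered by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def multi_scale_fuse(signal, kernel, dilations=(1, 2, 3)):
    """Run one branch per dilation rate and average them (hypothetical fusion rule)."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(vals) / len(branches) for vals in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    for d in (1, 2, 3):
        print(d, receptive_field(3, d), dilated_conv1d(impulse, [1, 1, 1], d))
    print(multi_scale_fuse(impulse, [1, 1, 1]))
```

With a 3-tap kernel, the branches cover 3, 5, and 7 samples, so a single fused output mixes fine and coarse context without adding parameters per branch; a real detector would use learned 2-D kernels and a learned fusion step.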