[1] LU Jun, ZHAO Haoran, LU Linchao. Research on 3D object detection based on multi-modal fusion[J]. CAAI Transactions on Intelligent Systems, 2025, 20(5): 1167-1177. [doi:10.11992/tis.202502015]
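CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
20
Issue:
2025, No. 5
Pages:
1167-1177
Section:
Academic Papers: Machine Perception and Pattern Recognition
Publication date: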
2025-09-05
- Title:
-
Research on 3D object detection based on multi-modal fusion
- Author(s):
-
LU Jun (陆军), ZHAO Haoran (赵颢然), LU Linchao (鲁林超)
-
College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
- Keywords:
-
3D target detection; multimodal fusion; deep learning; depth estimation; feature aggregation; attention mechanism; LiDAR; autonomous driving
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.202502015
- Abstract:
-
In autonomous driving, 3D object detection based on multimodal fusion is sensitive to insufficient sensor calibration. In complex scenes with dense targets, detection is also prone to false positives, which lowers the model's recall and detection accuracy. To address these problems, we design a multimodal fusion network, SoftFusion-QC (SoftFusion with query contrast), for 3D object detection. To adaptively fuse LiDAR point clouds with camera images, we propose a deformable cross-modality feature aggregate (DCFA) module, which performs deep feature fusion and mitigates the effect of inadequate sensor calibration. To reduce false positives in dense scenes, we introduce a query contrast (QC) mechanism that combines a Transformer-based query interaction strategy with a query box contrastive learning strategy, improving detection accuracy and robustness. On the nuScenes autonomous driving dataset, the method achieves 69.8% mAP (mean average precision) and 72.8% NDS (nuScenes detection score). Quantitative performance analysis and ablation studies validate the effectiveness of the algorithm.
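The abstract does not specify how the query box contrastive learning strategy is formulated. As a rough illustration of the general idea only, the sketch below uses a generic InfoNCE-style loss in which each query embedding is pulled toward its matched target embedding and pushed away from the others. The function names, the temperature value, and the pairing of `queries[i]` with `targets[i]` are assumptions made for this example; this is not the authors' implementation.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def query_contrast_loss(queries, targets, temperature=0.1):
    """Generic InfoNCE-style loss (hypothetical helper, not from the paper).

    queries[i] is treated as the positive match of targets[i]; every other
    targets[j] acts as a negative. Returns the mean loss over all queries.
    """
    q = [l2_normalize(v) for v in queries]
    t = [l2_normalize(v) for v in targets]
    total = 0.0
    for i, qi in enumerate(q):
        # Cosine similarity between this query and every target, scaled.
        logits = [dot(qi, tj) / temperature for tj in t]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
        # Negative log-softmax probability of the positive pair.
        total += log_sum - logits[i]
    return total / len(q)


# Toy embeddings (illustrative values, not data from the paper).
aligned = query_contrast_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = query_contrast_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

With these toy inputs the loss for correctly matched pairs (`aligned`) is smaller than the loss when the pairing is swapped (`swapped`), which is the behavior a contrastive objective of this kind is meant to reward.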
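Memo
Received: 2025-02-26.
Funding: Natural Science Foundation of Heilongjiang Province (F201123).
About the authors: LU Jun, professor, doctoral supervisor, Ph.D.; main research interests are computer vision, machine perception, and robotic arm control. Review expert for the Ministry of Science and Technology Innovation Fund for Technology-based Small and Medium-sized Enterprises and peer reviewer for the National Natural Science Foundation of China. Has published more than 80 academic papers and 5 books. E-mail: lujun0260@sina.com. ZHAO Haoran, master's student; main research interests are 3D object detection and computer vision. E-mail: 1793961894@qq.com. LU Linchao, master's degree; main research interests are 3D object detection and computer vision. E-mail: llczsr@163.com.
Corresponding author: LU Jun. E-mail: lujun0260@sina.com
Last Update: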
2025-09-05