[1] LU Jun, ZHAO Haoran, LU Linchao. Research on 3D object detection based on multi-modal fusion[J]. CAAI Transactions on Intelligent Systems, 2025, 20(5): 1167-1177. [doi:10.11992/tis.202502015]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 20
Issue: 2025, No. 5
Page number: 1167-1177
Column: Academic Papers: Machine Perception and Pattern Recognition
Publication date: 2025-09-05
- Title: Research on 3D object detection based on multi-modal fusion
- Author(s): LU Jun; ZHAO Haoran; LU Linchao
  College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
- Keywords: 3D target detection; multimodal fusion; deep learning; depth estimation; feature aggregation; attention mechanism; LiDAR; autonomous driving
- CLC: TP391
- DOI: 10.11992/tis.202502015
- Abstract:
In autonomous driving, 3D object detection based on multimodal fusion is sensitive to inaccurate sensor calibration. In complex scenes with dense targets, detection is also prone to false positives, which lowers the model's recall and precision. To address these problems, we design SoftFusion-QC (SoftFusion with query contrast), a multimodal fusion network for 3D object detection. To adaptively fuse LiDAR point clouds with camera images, we propose a deformable cross-modality feature aggregation (DCFA) module, which fuses features at a deep level and mitigates the effect of inaccurate sensor calibration. To reduce false positives in dense scenes, we introduce a query contrast (QC) mechanism that combines a Transformer-based query interaction strategy with a query box contrastive learning strategy, improving detection accuracy and robustness. On the nuScenes autonomous driving dataset, our method achieves 69.8% mAP (mean average precision) and 72.8% NDS (nuScenes detection score). Quantitative performance analysis and ablation studies validate the effectiveness of the algorithm.
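To make the two reported metrics easier to relate, the sketch below shows how NDS is assembled from mAP and the five mean true-positive error metrics under the standard nuScenes benchmark definition, NDS = (5 * mAP + sum over the five errors of (1 - min(1, error))) / 10. The function names `nds` and `implied_mean_tp_score` and the error values in the demo are hypothetical and chosen for illustration only; the abstract does not report per-metric errors, and this is not the authors' evaluation code.

```python
# Names of the five true-positive (TP) error metrics in the nuScenes
# detection benchmark: translation, scale, orientation, velocity, attribute.
TP_METRICS = ("mATE", "mASE", "mAOE", "mAVE", "mAAE")


def nds(map_score, tp_errors):
    """Combine mAP (in [0, 1]) with the five TP errors into NDS.

    Each error is clipped at 1 and turned into a score (1 - error),
    then mAP is weighted 5x against the five TP scores.
    """
    missing = [m for m in TP_METRICS if m not in tp_errors]
    if missing:
        raise ValueError(f"missing TP error metrics: {missing}")
    tp_scores = [1.0 - min(1.0, tp_errors[m]) for m in TP_METRICS]
    return (5.0 * map_score + sum(tp_scores)) / 10.0


def implied_mean_tp_score(nds_score, map_score):
    """Invert the NDS formula: average TP score implied by NDS and mAP."""
    return (10.0 * nds_score - 5.0 * map_score) / 5.0


if __name__ == "__main__":
    # Hypothetical error values, not taken from the paper.
    hypothetical_errors = {m: 0.5 for m in TP_METRICS}
    print(nds(0.5, hypothetical_errors))

    # Values reported in the abstract: 72.8% NDS and 69.8% mAP.
    print(round(implied_mean_tp_score(0.728, 0.698), 3))
```

Assuming the standard definition applies, the reported 69.8% mAP and 72.8% NDS imply that the five true-positive scores average about 0.758, so the localization, scale, orientation, velocity, and attribute terms together lift NDS above mAP.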