[1]LU Jun,ZOU Kangcheng,LI Yang.Feature flow-based point cloud object detection method[J].CAAI Transactions on Intelligent Systems,2026,21(1):146-155.[doi:10.11992/tis.202503005]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 21
Issue: 2026(1)
Pages: 146-155
Column: Academic Papers: Intelligent Systems
Publication date: 2026-03-05
- Title: Feature flow-based point cloud object detection method
- Author(s): LU Jun; ZOU Kangcheng; LI Yang
- Affiliation: College of Intelligent Science and Engineering, Harbin Engineering University, Harbin 150001, China
- Keywords: lidar point cloud; object detection; feature flow; feature alignment; temporal feature fusion; deformable attention mechanism; bird’s-eye view; multi-frame point cloud fusion
- CLC: TP391
- DOI: 10.11992/tis.202503005
- Abstract: To address the missing scene information and missed detections caused by point cloud sparsity in existing lidar-based 3D object detection methods, this paper proposes a single-stage 3D object detection algorithm based on feature flow, which improves detection performance through multi-frame spatio-temporal feature fusion and a dynamic alignment mechanism. First, a multi-frame fusion framework driven by a gating network is constructed; a deformable attention mechanism works with a spatio-temporal feature extraction module to dynamically align cross-frame features and suppress false detections caused by fusing misaligned features. Second, a spatio-temporal-feature-guided deformable attention mechanism is designed that predicts feature offsets and attention weights from object motion information, improving feature matching accuracy on sparse point clouds. Finally, a hierarchical feature flow extraction module combines multi-scale feature extraction with a progressive fusion strategy to strengthen scene representation. Experiments show that the proposed algorithm achieves 63.73% mAP on the nuScenes validation set, 4.51% higher than the voxel-based baseline, and improves detection accuracy on small objects such as motorcycles and bicycles by more than 14%. Ablation studies show that the multi-frame complementary mechanism increases the recall of distant objects (>50 m) by 16.2% and reduces the missed detection rate in occluded scenes by 11.8%. This study provides an effective solution for sparse point cloud 3D detection in autonomous driving.
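The gated multi-frame fusion described in the abstract can be illustrated with a minimal sketch: a per-cell gate, computed from the concatenated current and previous bird's-eye-view (BEV) feature maps, blends the two frames. The function `gated_fusion`, the 1x1 gate weights `w`, and the random tensors below are illustrative assumptions for this sketch, not the paper's implementation, which applies the gate to deformable-attention-aligned features inside a deep network.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(curr, prev_aligned, w, b):
    """Per-cell gated fusion of two BEV feature maps (illustrative sketch).

    curr, prev_aligned : (C, H, W) arrays; prev_aligned stands in for the
        previous frame's features after alignment to the current frame.
    w : (1, 2C) weights of a 1x1 gate convolution; b : scalar bias.
    Returns g * curr + (1 - g) * prev_aligned, with gate g in (0, 1).
    """
    C, H, W = curr.shape
    stacked = np.concatenate([curr, prev_aligned], axis=0)  # (2C, H, W)
    flat = stacked.reshape(2 * C, H * W)                    # 1x1 conv == matmul
    g = sigmoid(w @ flat + b).reshape(1, H, W)              # per-cell gate
    return g * curr + (1.0 - g) * prev_aligned              # convex blend

# Toy example: 4-channel, 8x8 BEV grids for two frames.
C, H, W = 4, 8, 8
curr = rng.normal(size=(C, H, W))
prev = rng.normal(size=(C, H, W))
w = rng.normal(size=(1, 2 * C)) * 0.1
fused = gated_fusion(curr, prev, w, 0.0)
print(fused.shape)  # (4, 8, 8)
```

Because the gate lies in (0, 1), each output cell is a convex combination of the two frames, so the fusion can lean on the previous frame where the current sweep is sparse without overwriting reliable current-frame features.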