[1] ZHOU Jing, HU Yiyu, HUANG Xinhan. Shape completion-guided Transformer point cloud object detection method[J]. CAAI Transactions on Intelligent Systems, 2023, 18(4): 731-742. [doi:10.11992/tis.202210038]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 18
Issue: 2023, No. 4
Pages: 731-742
Column: Academic Papers: Machine Perception and Pattern Recognition
Publication date: 2023-07-15
- Title: Shape completion-guided Transformer point cloud object detection method
- Author(s): ZHOU Jing (1); HU Yiyu (1); HUANG Xinhan (2)
  1. School of Artificial Intelligence, Jianghan University, Wuhan 430056, China
  2. School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
- Keywords: 3D object detection; low-quality object; feature separation; shape completion; Transformer; multi-scale; neighboring mask; feature enhancement
- CLC: TP391.41
- DOI: 10.11992/tis.202210038
- Abstract:
Point clouds of scenes collected by LiDAR sensors contain many low-quality objects whose shapes are incomplete because of long distance or occlusion; their geometric information is too sparse for reliable recognition, which degrades detection accuracy. To address this problem, a shape completion-guided Transformer point cloud object detection method (STDet) is proposed, which improves detection precision by enhancing the shape features of low-quality objects. First, features of the point cloud are extracted by the Pointformer backbone network to generate initial candidate boxes. Then, a shape completion module based on feature-separation prediction is designed to reconstruct the complete point cloud shape of the incomplete objects within each candidate box. Next, a Transformer geometric feature enhancement module is established. It integrates the complete shape information and spatial location knowledge of the object into its point-wise features and perceives the attention correlation between local structure information and global geometric features within different neighborhood masks, thereby acquiring a global geometric feature in which the critical geometric knowledge of the object is enhanced. Finally, refined detection boxes are generated under the guidance of the global geometric features. Experimental results on the KITTI dataset show that, compared with the benchmark algorithm, the proposed method improves detection accuracy by 4.96% in scenes with abundant low-quality objects of incomplete shapes. Extensive ablation experiments further verify the effectiveness of the proposed shape completion algorithm and the Transformer geometric feature encoding module.
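
The idea of attention restricted to "different neighborhood masks" can be illustrated with a small NumPy sketch. This is not the STDet implementation: the function names, the radius values, the single-head attention without learned projections, and the max-pooling used to form a global feature are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np

def neighborhood_mask(xyz, radius):
    """Boolean (N, N) mask: True where two points lie within `radius` of each other."""
    diff = xyz[:, None, :] - xyz[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return dist <= radius  # the diagonal is always True (distance 0)

def masked_self_attention(feats, mask):
    """Single-head scaled dot-product self-attention restricted to `mask`.

    Assumption: queries, keys and values are the raw features; a real model
    would apply learned linear projections first.
    """
    d = feats.shape[1]
    scores = feats @ feats.T / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)      # block points outside the neighborhood
    scores -= scores.max(axis=1, keepdims=True)   # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ feats

def multi_scale_global_feature(xyz, feats, radii=(0.5, 1.0, 2.0)):
    """Attend within several neighborhood sizes, concatenate, then max-pool.

    The radii and the max-pooling step are hypothetical choices.
    """
    per_scale = [masked_self_attention(feats, neighborhood_mask(xyz, r)) for r in radii]
    fused = np.concatenate(per_scale, axis=1)     # (N, C * len(radii))
    return fused.max(axis=0)                      # one global vector per object

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xyz = rng.normal(size=(16, 3))    # stand-in for completed object points in a candidate box
    feats = rng.normal(size=(16, 8))  # stand-in for point-wise features
    print(multi_scale_global_feature(xyz, feats).shape)
```

Small radii keep each point focused on local structure, while larger radii let it aggregate context from most of the object; concatenating the scales is one simple way to expose both to a later box-refinement stage.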