[1]陆军,邹康成,李杨.基于特征流的点云目标检测方法[J].智能系统学报,2026,21(1):146-155.[doi:10.11992/tis.202503005]
　LU Jun,ZOU Kangcheng,LI Yang.Feature flow-based point cloud object detection method[J].CAAI Transactions on Intelligent Systems,2026,21(1):146-155.[doi:10.11992/tis.202503005]

基于特征流的点云目标检测方法

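CAAI Transactions on Intelligent Systems (《智能系统学报》) [ISSN 1673-4785/CN 23-1538/TP] Volume: 21 Issue: 2026, No. 1 Pages: 146-155 Column: Academic Papers, Intelligent Systems Publication date: 2026-01-05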

Title:: Feature flow-based point cloud object detection method

作者:: 陆军, 邹康成, 李杨; 哈尔滨工程大学智能科学与工程学院, 黑龙江哈尔滨 150001

Author(s):: LU Jun, ZOU Kangcheng, LI Yang; College of Intelligent Science and Engineering, Harbin Engineering University, Harbin 150001, China

关键词:: 激光雷达点云; 目标检测; 特征流; 特征对齐; 时序特征融合; 可变形注意力机制; 鸟瞰视角表示; 多帧点云融合

Keywords:: lidar point cloud; object detection; feature flow; feature alignment; temporal feature fusion; deformable attention mechanism; bird’s-eye view; multi-frame point cloud fusion

分类号:: TP391

DOI:: 10.11992/tis.202503005

摘要:: 针对现有激光雷达点云三维目标检测方法因点云稀疏性导致的场景信息缺失与目标漏检问题，本文提出一种基于特征流的单阶段三维目标检测算法，该算法通过多帧时空特征融合与动态对齐机制优化检测性能。首先，构建门控网络驱动的多帧融合框架，利用可变形注意力机制协同时空特征提取模块，实现跨帧特征的动态对齐，抑制未对齐特征融合导致的误检；其次，设计时空特征引导的可变形注意力机制，通过目标运动信息预测特征偏移与权重，提升稀疏点云的特征匹配精度；最后，设计层级式特征流提取模块，结合多尺度特征提取与渐进融合策略，增强场景表征能力。实验结果表明，所提算法在NuScenes验证集上的平均精度均值达到63.73%，较体素基准方法提升4.51%，其中摩托车、自行车等小目标检测精度提升超过14%。消融实验结果表明，多帧互补机制使远距离目标(>50 m)召回率提升16.2%，遮挡场景漏检率降低11.8%。本研究为自动驾驶领域稀疏点云三维检测提供了有效方案。

Abstract:: Existing lidar point cloud 3D object detection methods suffer from missing scene information and missed detections caused by point cloud sparsity. To address this, this paper proposes a single-stage 3D object detection algorithm based on feature flow that improves detection performance through multi-frame spatio-temporal feature fusion and a dynamic alignment mechanism. First, a multi-frame fusion framework driven by a gating network is constructed, in which a deformable attention mechanism works with the spatio-temporal feature extraction module to dynamically align cross-frame features and suppress false detections caused by fusing unaligned features. Second, a deformable attention mechanism guided by spatio-temporal features is designed, which predicts feature offsets and weights from object motion information to improve feature matching accuracy on sparse point clouds. Finally, a hierarchical feature flow extraction module is designed, which combines multi-scale feature extraction with a progressive fusion strategy to strengthen scene representation. Experimental results show that the proposed algorithm achieves a mean average precision (mAP) of 63.73% on the nuScenes validation set, 4.51% higher than the voxel-based baseline, with detection accuracy for small objects such as motorcycles and bicycles improving by more than 14%. Ablation experiments show that the multi-frame complementary mechanism increases recall for distant objects (>50 m) by 16.2% and reduces the missed detection rate in occluded scenes by 11.8%. This study provides an effective solution for 3D detection on sparse point clouds in autonomous driving.
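
The idea of aligning previous-frame features before gated fusion can be illustrated with a toy sketch. This is not the authors' implementation: the paper uses a deformable attention mechanism with learned offsets and weights, whereas the sketch below assumes scalar bird's-eye-view features, integer per-cell offsets, and fixed gate parameters. The names `warp_by_flow` and `gated_fuse` and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def warp_by_flow(prev, flow):
    """Align a previous-frame feature map to the current frame.

    prev: H x W grid of scalar features (list of lists).
    flow: H x W grid of (dy, dx) integer offsets (an assumption; learned
          offsets would be fractional). Output cell (y, x) reads
          prev[y + dy][x + dx]; out-of-range reads give 0.0.
    """
    h, w = len(prev), len(prev[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            sy, sx = y + dy, x + dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = prev[sy][sx]
    return out


def gated_fuse(cur, other, w_cur=1.0, w_other=1.0, bias=0.0):
    """Blend two feature maps with a per-cell sigmoid gate.

    gate  = sigmoid(w_cur * cur + w_other * other + bias)
    fused = gate * cur + (1 - gate) * other
    The gate parameters are fixed here; in a trained network they
    would be learned.
    """
    h, w = len(cur), len(cur[0])
    fused = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            c, o = cur[y][x], other[y][x]
            g = sigmoid(w_cur * c + w_other * o + bias)
            fused[y][x] = g * c + (1.0 - g) * o
    return fused


# An object response moves one cell to the right between frames.
prev = [[0.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]]
cur = [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0], [0.0, 0.0, 0.0]]
flow = [[(0, -1)] * 3 for _ in range(3)]  # each cell reads its left neighbour

aligned = warp_by_flow(prev, flow)
fused_aligned = gated_fuse(cur, aligned)  # response stays in one cell
fused_naive = gated_fuse(cur, prev)       # response smears over two cells
```

In this toy case, fusing without alignment leaves a residual response at the object's old position, which is the kind of artifact the abstract attributes to unaligned feature fusion; warping first keeps the response in a single cell.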

备注/Memo

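Received: 2025-03-04.
Funding: Natural Science Foundation of Heilongjiang Province (F201123).
About the authors: LU Jun, professor, doctoral supervisor, Ph.D.; main research interests: computer vision, machine perception, and robotic arm control. Author of 5 books and more than 80 academic papers. E-mail: lujun0260@sina.com. ZOU Kangcheng, master's student; main research interests: 3D object detection and computer vision. E-mail: z127577@163.com. LI Yang, master's student; main research interests: point cloud object detection, tracking, machine vision, and image processing. E-mail: liyang142857@126.com.
Corresponding author: LU Jun. E-mail: lujun0260@sina.com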

更新日期/Last Update: 2026-01-05
