[1] LU Bin, YANG Zhenyu, SUN Yang, et al. 3D object detection algorithm with multi-channel cross attention fusion[J]. CAAI Transactions on Intelligent Systems, 2024, 19(4): 885-897. [doi:10.11992/tis.202305029]
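CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 19
Issue: No. 4, 2024
Pages: 885-897
Column: Academic Papers: Machine Perception and Pattern Recognition
Publication date: 2024-07-05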
- Title:
-
3D object detection algorithm with multi-channel cross attention fusion
- Author(s):
-
LU Bin1,2, YANG Zhenyu1,2, SUN Yang1,2, LIU Yawei1,2, WANG Minghan1,2
-
1. School of Control and Computer Engineering, North China Electric Power University, Baoding 071000, China;
2. Hebei Key Laboratory of Knowledge Computing for Energy & Power, North China Electric Power University, Baoding 071000, China
-
- Keywords:
-
3D point cloud; autonomous driving; LiDAR; deep learning; 3D object detection; pillar; cross attention; single-stage algorithm
- Classification number:
-
TP391
- DOI:
-
10.11992/tis.202305029
- Abstract:
-
Existing single-stage 3D object detection algorithms use downsampled point cloud features in only one way, and their features do not aggregate long-range contextual information well enough to support further performance gains. To address these problems, we propose a single-stage 3D object detection algorithm based on multi-channel cross attention fusion. First, a channel-wise cross attention module is designed to fuse the downsampled features; using the cross attention mechanism, it enhances, at the channel level, the ability of multi-scale features to express long-range spatial information under different receptive fields. Then, a cascade feature excitation module is proposed, which uses the original downsampled features to apply cascaded excitation to the channel-wise cross attention weighted features, improving the algorithm's ability to learn key spatial features. Extensive experiments were conducted on the public autonomous driving dataset KITTI, with comparisons against mainstream algorithms. As a single-stage algorithm, the proposed method achieves detection accuracies of 91.34%, 79.85%, and 75.98% on the three difficulty levels of the car category, improvements of 4.83%, 3.26%, and 3.32%, respectively, over the baseline algorithm. The experimental results demonstrate the effectiveness and advancement of the proposed algorithm and modules for the 3D object detection task.
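The two modules named in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the abstract gives no equations, so the scaled dot-product form of the attention, the sigmoid gate with a residual path, and all function names (`channel_cross_attention`, `cascade_excitation`) are assumptions chosen for illustration. The sketch also assumes the multi-scale features have already been resampled to a common spatial grid and flattened, and it uses plain Python lists rather than a deep learning framework.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(v):
    """Sigmoid that avoids overflow for large negative inputs."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def channel_cross_attention(query_feat, context_feat):
    """Attend from each query channel to every context channel.

    Each input is a list of channels; each channel is a flattened
    spatial map of the same length n (assumed common grid).
    """
    n = len(query_feat[0])
    scale = 1.0 / math.sqrt(n)
    fused = []
    for q in query_feat:
        # Similarity between this query channel and each context channel.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in context_feat]
        weights = softmax(scores)
        # Weighted sum of the context channels, position by position.
        fused.append([
            sum(w * k[i] for w, k in zip(weights, context_feat))
            for i in range(n)
        ])
    return fused


def cascade_excitation(original, attended):
    """Gate the original feature with the attended one (assumed form).

    Computes x * (1 + sigmoid(a)), so the original signal is kept and
    amplified where the attended feature is strong.
    """
    return [
        [x * (1.0 + sigmoid(a)) for x, a in zip(orig_ch, att_ch)]
        for orig_ch, att_ch in zip(original, attended)
    ]


# Toy features: 2 fine-scale channels and 3 coarse-scale channels, 4 positions each.
fine = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 0.0, 1.0]]
coarse = [[0.5, 0.5, 1.0, 1.0], [1.0, 0.0, 0.0, 1.0], [0.0, 2.0, 0.0, 0.0]]

attended = channel_cross_attention(fine, coarse)
output = cascade_excitation(fine, attended)
```

In this sketch the attention matrix has one row per fine-scale channel and one column per coarse-scale channel, so the mixing happens across channels rather than across spatial positions, which is one plausible reading of "channel-wise" in the abstract.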
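Memo
Received: 2023-05-16.
Funding: Key Research and Development Program of Hebei Province (20310103D); Hebei Province Graduate Student Innovation Ability Training Funding Project (CXZZBS2023153).
About the authors: LU Bin, professor, Ph.D., doctoral supervisor, senior member of CCF. Main research interests: intelligent computing and computer vision, integrated energy systems and big data analysis. E-mail: lubin@ncepu.edu.cn; YANG Zhenyu, Ph.D. candidate. Main research interests: machine learning, computer vision. E-mail: yangzhenyu536@163.com; SUN Yang, Ph.D. candidate. Main research interests: machine learning, computer vision. E-mail: bless2016@163.com
Corresponding author: LU Bin. E-mail: lubin@ncepu.edu.cn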
Last Update:
1900-01-01