[1] LU Jun, LU Linchao, ZHAI Xiaoyang, et al. High-efficiency 3D object detection for road traffic scenes[J]. CAAI Transactions on Intelligent Systems, 2025, 20(1): 91-100. [doi: 10.11992/tis.202311013]

High-efficiency 3D object detection for road traffic scenes

References:
[1] HUANG Keli, SHI Botian, LI Xiang, et al. Multi-modal sensor fusion for auto driving perception: a survey[EB/OL]. (2022-02-06)[2023-11-13]. https://arxiv.org/abs/2202.02703.
[2] LIU Tong, GAO Sijie, NIE Weizhi. Multitarget detection algorithm based on multimodal information fusion[J]. Laser & optoelectronics progress, 2022, 59(8): 339-348.
[3] SONG Ziying, LIU Lin, JIA Feiyang, et al. Robustness-aware 3D object detection in autonomous driving: a review and outlook[EB/OL]. (2024-01-12)[2024-08-02]. http://arxiv.org/abs/2401.06542v3.
[4] VORA S, LANG A H, HELOU B, et al. PointPainting: sequential fusion for 3D object detection[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 4603-4611.
[5] WANG Chunwei, MA Chao, ZHU Ming, et al. PointAugmenting: cross-modal augmentation for 3D object detection[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 11789-11798.
[6] XU Shaoqing, ZHOU Dingfu, FANG Jin, et al. FusionPainting: multimodal fusion with adaptive attention for 3D object detection[C]//2021 IEEE International Intelligent Transportation Systems Conference. Indianapolis: IEEE, 2021: 3047-3054.
[7] BAI Xuyang, HU Zeyu, ZHU Xinge, et al. TransFusion: robust LiDAR-camera fusion for 3D object detection with transformers[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 1080-1089.
[8] LIANG Tingting, XIE Hongwei, YU Kaicheng, et al. BEVFusion: a simple and robust LiDAR-camera fusion framework[J]. Advances in neural information processing systems, 2022, 35: 10421-10434.
[9] LI Yingwei, YU A W, MENG Tianjian, et al. DeepFusion: LiDAR-camera deep fusion for multi-modal 3D object detection[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 17161-17170.
[10] HU Haotian, WANG Fanyi, SU Jingwen, et al. EA-BEV: edge-aware bird’s-eye-view projector for 3D object detection[EB/OL]. (2023-03-31)[2023-11-13]. https://arxiv.org/abs/2303.17895.
[11] YAN Junjie, LIU Yingfei, SUN Jianjian, et al. Cross modal transformer via coordinates encoding for 3D object detection[EB/OL]. (2023-01-03)[2023-11-13]. https://arxiv.org/abs/2301.01283.
[12] WANG Haiyang, TANG Hao, SHI Shaoshuai, et al. UniTR: a unified and efficient multi-modal transformer for bird’s-eye-view representation[C]//2023 IEEE/CVF International Conference on Computer Vision. Paris: IEEE, 2023: 6792-6802.
[13] ZHANG Xinyu, ZOU Zhenhong, LI Zhiwei, et al. Deep multi-modal fusion in object detection for autonomous driving[J]. CAAI transactions on intelligent systems, 2020, 15(4): 758-771.
[14] LU Bin, YANG Zhenyu, SUN Yang, et al. 3D object detection algorithm with multi-channel cross attention fusion[J]. CAAI transactions on intelligent systems, 2024, 19(4): 885-897.
[15] YAN Yan, MAO Yuxing, LI Bo. SECOND: sparsely embedded convolutional detection[J]. Sensors, 2018, 18(10): 3337.
[16] LANG A H, VORA S, CAESAR H, et al. PointPillars: fast encoders for object detection from point clouds[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 12689-12697.
[17] YANG Zetong, ZHOU Yin, CHEN Zhifeng, et al. 3D-MAN: 3D multi-frame attention network for object detection[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 1863-1872.
[18] ZHOU Yin, TUZEL O. VoxelNet: end-to-end learning for point cloud based 3D object detection[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 4490-4499.
[19] YANG Zetong, SUN Yanan, LIU Shu, et al. STD: sparse-to-dense 3D object detector for point cloud[C]//2019 IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 2019: 1951-1960.
[20] YANG Zetong, SUN Yanan, LIU Shu, et al. 3DSSD: point-based 3D single stage object detector[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 11037-11045.
[21] SHI Shaoshuai, GUO Chaoxu, JIANG Li, et al. PV-RCNN: point-voxel feature set abstraction for 3D object detection[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 10526-10535.
[22] SHI Shaoshuai, WANG Xiaogang, LI Hongsheng. PointRCNN: 3D object proposal generation and detection from point cloud[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 770-779.
[23] LI Bo, YAN Junjie, WU Wei, et al. High performance visual tracking with Siamese region proposal network[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 8971-8980.
[24] QI C R, YI Li, SU Hao, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space[EB/OL]. (2017-06-07)[2023-11-13]. http://arxiv.org/abs/1706.02413v1.
[25] JIANG Yingying, ZHU Xiangyu, WANG Xiaobing, et al. R2CNN: rotational region CNN for orientation robust scene text detection[EB/OL]. (2017-06-29)[2023-11-13]. https://arxiv.org/abs/1706.09579.
[26] HOU Yi, ZHANG Hong, ZHOU Shilin, et al. Efficient ConvNet feature extraction with multiple RoI pooling for landmark-based visual localization of autonomous vehicles[J]. Mobile information systems, 2017: 8104386.
[27] LI Yangyan, BU Rui, SUN Mingchao, et al. PointCNN: convolution on X-transformed points[J]. Advances in neural information processing systems, 2018, 31: 820-830.
[28] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//2017 IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 2999-3007.
[29] WANG Hua, NIE Feiping, HUANG Heng. Robust distance metric learning via simultaneous l1-norm minimization and maximization[C]//Proceedings of the 31st International Conference on Machine Learning. Beijing: PMLR, 2014: 1836-1844.
[30] GROH F, WIESCHOLLEK P, LENSCH H P A. Flex-convolution[C]//Asian Conference on Computer Vision. Cham: Springer, 2018: 105-122.
[31] DOVRAT O, LANG I, AVIDAN S. Learning to sample[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 2755-2764.
[32] YANG Jiancheng, ZHANG Qiang, NI Bingbing, et al. Modeling point clouds with self-attention and Gumbel subset sampling[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 3318-3327.
[33] XU K, BA J L, KIROS R, et al. Show, attend and tell: neural image caption generation with visual attention[C]//Proceedings of the 32nd International Conference on Machine Learning. Lille: PMLR, 2015: 2048-2057.
[34] BROWN R A. Building a balanced k-d tree in O(kn log n) time[EB/OL]. (2014-10-20)[2023-11-13]. http://arxiv.org/abs/1410.5420v46.
[35] CHEN Xiaozhi, MA Huimin, WAN Ji, et al. Multi-view 3D object detection network for autonomous driving[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 1907-1915.