HE Guohao, ZHAI Yong, GONG Jianwei, et al. Real-time stereo matching network for vehicle binocular vision based on dynamic cascade correction[J]. CAAI Transactions on Intelligent Systems, 2022, 17(6): 1145-1153. [doi:10.11992/tis.202111013]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 17
Issue: No. 6, 2022
Pages: 1145-1153
Section: Academic Papers - Machine Perception and Pattern Recognition
Publication date: 2022-11-05
Title:
Real-time stereo matching network for vehicle binocular vision based on dynamic cascade correction
Authors:
HE Guohao1, ZHAI Yong1, GONG Jianwei1,2, WANG Yuchun1, ZHANG Xi2
1. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China;
2. Chongqing Innovation Center, Beijing Institute of Technology, Chongqing 401120, China
Keywords:
binocular vision; deep learning; stereo matching; disparity estimation; dynamic computation; feature fusion; on-board vision
CLC number: TP29
DOI: 10.11992/tis.202111013
Online publication date: 2022-09-23
Abstract:
High-precision stereo matching networks based on binocular vision consume substantial computing resources and run too slowly to be used for real-time navigation in intelligent driving systems. To address this, this study proposes a dynamic-fusion stereo matching deep learning network that meets on-board real-time and accuracy requirements. The network uses an attention module based on global depthwise convolution for feature extraction, reducing the number of network layers and parameters, and accelerates the commonly used 3D feature fusion process by optimizing the 3D convolution computation through dynamic cost cascade fusion, multi-scale fusion, and dynamic disparity refinement. The trained model is deployed on onboard hardware such as the NVIDIA Jetson TX2 and tested on the public KITTI Stereo 2015 dataset. Experiments show that the method achieves accuracy comparable to the best published methods on the leaderboard, with a 3-pixel error below 6.58% and a runtime under 0.1 s per frame, meeting on-board real-time performance requirements.
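As background to the cost fusion and disparity estimation described in the abstract, the sketch below shows the classic cost-volume construction and winner-take-all disparity selection that learning-based stereo networks such as this one refine. It is a minimal NumPy illustration under simplifying assumptions (rectified grayscale pair, absolute-difference cost), not the authors' network; the function name `cost_volume_wta` is an illustrative choice.

```python
import numpy as np

def cost_volume_wta(left, right, max_disp):
    """Build an absolute-difference cost volume over candidate disparities
    for a rectified grayscale stereo pair, then take the per-pixel
    winner-take-all (argmin) disparity."""
    h, w = left.shape
    # cost[d, y, x] = matching cost of assigning disparity d to pixel (y, x)
    cost = np.full((max_disp, h, w), np.inf)
    for d in range(max_disp):
        # A left pixel at column x matches the right pixel at column x - d,
        # so only columns x >= d have a valid candidate at this disparity.
        cost[d, :, d:] = np.abs(left[:, d:] - right[:, : w - d])
    return np.argmin(cost, axis=0)

# Synthetic pair: the right view is the left view shifted by 3 pixels.
rng = np.random.default_rng(0)
left = rng.random((8, 32))
right = np.zeros_like(left)
right[:, : 32 - 3] = left[:, 3:]

disp = cost_volume_wta(left, right, max_disp=8)
print(disp[0, 3:10])  # recovers disparity 3 wherever the shift is valid
```

Deep stereo networks replace the hand-crafted absolute-difference cost with learned features and the hard argmin with 3D convolutions over the cost volume, which is the expensive step the paper's dynamic cascade fusion targets.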
Memo
Received: 2021-11-06.
Foundation item: National Natural Science Foundation of China (U19A2083, 61703041).
Biographies: HE Guohao, master's student; his main research interests are intelligent driving and visual perception for intelligent systems. ZHAI Yong, associate professor; his main research interest is vehicle electronic control; he holds 5 authorized invention patents and has published 10 academic papers. GONG Jianwei, professor and director of the Automotive Research Institute; his main research interest is technologies for unmanned ground platforms; he has led more than 10 national or provincial/ministerial projects, holds 30 authorized invention patents, has published 13 academic papers, and has co-authored 5 monographs and textbooks.
Corresponding author: GONG Jianwei. E-mail: gongjianwei@bit.edu.cn