WANG Xia, ZUO Yifan. Advances in visual SLAM research[J]. CAAI Transactions on Intelligent Systems, 2020, 15(5): 825-834. [doi:10.11992/tis.202004023]

CAAI Transactions on Intelligent Systems (《智能系统学报》) [ISSN:1673-4785/CN:23-1538/TP]

Volume:
Vol. 15
Issue:
No. 5, 2020
Pages:
825-834
Section:
Review
Publication date:
2020-10-31

Article Info

Title:
Advances in visual SLAM research
Author(s):
WANG Xia, ZUO Yifan
Affiliation:
Key Laboratory of Photo-electronic Imaging Technology and System, Ministry of Education of China, Beijing Institute of Technology, Beijing 100081, China
Keywords:
visual simultaneous localization and mapping; sparse visual SLAM; semi-dense visual SLAM; dense visual SLAM; visual sensors; optimization; visual SLAM system; metric map
CLC number:
TP391
DOI:
10.11992/tis.202004023
Document code:
A
Abstract:
Visual SLAM is simultaneous localization and mapping with a camera as the sensor: the system estimates its own pose while building a map of the environment. It plays an important role in the navigation of robots, unmanned aerial vehicles, and unmanned ground vehicles; localization accuracy affects obstacle avoidance, and map quality directly affects the performance of downstream algorithms such as path planning, making visual SLAM a core algorithm for intelligent mobile platforms. This paper introduces the architecture of mainstream visual SLAM systems, including the most common visual sensors, the function of the front end, and the optimization-based back end. According to the type of metric map the system builds, visual SLAM is classified into three types: sparse, semi-dense, and dense visual SLAM. The landmark achievements and research progress of each type are reviewed, and the current problems and possible future directions of visual SLAM are discussed.

References:

[1] LEONARD J J, DURRANT-WHYTE H F. Simultaneous map building and localization for an autonomous mobile robot[C]//Proceedings IROS '91: IEEE/RSJ International Workshop on Intelligent Robots and Systems '91. Osaka, Japan, 1991: 1442-1447.
[2] SMITH R, SELF M, CHEESEMAN P. Estimating uncertain spatial relationships in robotics[M]//COX I J, WILFONG G T. Autonomous Robot Vehicles. New York, USA: Springer, 1990: 167-193.
[3] MUR-ARTAL R, MONTIEL J M M, TARDOS J D. ORB-SLAM: a versatile and accurate monocular SLAM system[J]. IEEE transactions on robotics, 2015, 31(5): 1147-1163.
[4] QIN Tong, LI Peiliang, SHEN Shaojie. VINS-Mono: a robust and versatile monocular visual-inertial state estimator[J]. IEEE transactions on robotics, 2018, 34(4): 1004-1020.
[5] KLEIN G, MURRAY D. Parallel tracking and mapping on a camera phone[C]//Proceedings of the 2009 8th IEEE International Symposium on Mixed and Augmented Reality. Orlando, USA, 2009: 83-86.
[6] KÄHLER O, PRISACARIU V A, REN C Y, et al. Very high frame rate volumetric integration of depth images on mobile devices[J]. IEEE transactions on visualization and computer graphics, 2015, 21(11): 1241-1250.
[7] LYNEN S, SATTLER T, BOSSE M, et al. Get out of my lab: large-scale, real-time visual-inertial localization[C]//Proceedings of Robotics: Science and Systems. Rome, Italy, 2015.
[8] 高翔, 张涛, 刘毅, 等. 视觉SLAM十四讲[M]. 北京: 电子工业出版社, 2017: 13-19.
[9] TAKETOMI T, UCHIYAMA H, IKEDA S. Visual SLAM algorithms: a survey from 2010 to 2016[J]. IPSJ transactions on computer vision and applications, 2017, 9(1): 16.
[10] CADENA C, CARLONE L, CARRILLO H, et al. Past, present, and future of simultaneous localization and mapping: toward the robust-perception age[J]. IEEE transactions on robotics, 2016, 32(6): 1309-1332.
[11] HUANG Baichuan, ZHAO Jun, LIU Jingbin. A survey of simultaneous localization and mapping with an envision in 6G wireless networks[EB/OL]. (2020-02-14)[2020-03-20]. https://arxiv.org/pdf/1909.05214.pdf.
[12] 刘浩敏, 章国锋, 鲍虎军. 基于单目视觉的同时定位与地图构建方法综述[J]. 计算机辅助设计与图形学学报, 2016, 28(6): 855-868.
LIU Haomin, ZHANG Guofeng, BAO Hujun. A survey of monocular simultaneous localization and mapping[J]. Journal of computer-aided design & computer graphics, 2016, 28(6): 855-868.
[13] GALLEGO G, DELBRUCK T, ORCHARD G, et al. Event-based vision: a survey[J]. arXiv: 1904.08405, 2019.
[14] LICHTSTEINER P, POSCH C, DELBRUCK T. A 128×128 120 dB 15 μs latency asynchronous temporal contrast vision sensor[J]. IEEE journal of solid-state circuits, 2008, 43(2): 566-576.
[15] SON B, SUH Y, KIM S, et al. 4.1 A 640×480 dynamic vision sensor with a 9 μm pixel and 300 Meps address-event representation[C]//Proceedings of 2017 IEEE International Solid-State Circuits Conference. San Francisco, USA, 2017: 66-67.
[16] POSCH C, MATOLIN D, WOHLGENANNT R, et al. A microbolometer asynchronous dynamic vision sensor for LWIR[J]. IEEE sensors journal, 2009, 9(6): 654-664.
[17] HOFSTÄTTER M, SCHÖN P, POSCH C. A SPARC-compatible general purpose address-event processor with 20-bit 10 ns-resolution asynchronous sensor data interface in 0.18 μm CMOS[C]//Proceedings of 2010 IEEE International Symposium on Circuits and Systems. Paris, France, 2010: 4229-4232.
[18] POSCH C, HOFSTATTER M, MATOLIN D, et al. A dual-line optical transient sensor with on-chip precision time-stamp generation[C]//Proceedings of 2007 IEEE International Solid-State Circuits Conference. Digest of Technical Papers. San Francisco, USA, 2007: 500-618.
[19] BRANDLI C, BERNER R, YANG Minhao, et al. A 240×180 130 dB 3 μs latency global shutter spatiotemporal vision sensor[J]. IEEE journal of solid-state circuits, 2014, 49(10): 2333-2341.
[20] POSCH C, MATOLIN D, WOHLGENANNT R. A QVGA 143 dB dynamic range frame-free PWM image sensor with lossless pixel-level video compression and time-domain CDS[J]. IEEE journal of solid-state circuits, 2011, 46(1): 259-275.
[21] BAILEY T, DURRANT-WHYTE H. Simultaneous localization and mapping (SLAM): Part II[J]. IEEE robotics & automation magazine, 2006, 13(3): 108-117.
[22] DURRANT-WHYTE H, BAILEY T. Simultaneous localization and mapping: Part I[J]. IEEE robotics & automation magazine, 2006, 13(2): 99-110.
[23] DAVISON A J, REID I D, MOLTON N D, et al. MonoSLAM: real-time single camera SLAM[J]. IEEE transactions on pattern analysis and machine intelligence, 2007, 29(6): 1052-1067.
[24] KLEIN G, MURRAY D. Parallel tracking and mapping for small AR workspaces[C]//Proceedings of the 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality. Nara, Japan, 2007: 225-234.
[25] KLEIN G, MURRAY D. Improving the agility of keyframe-based SLAM[C]//Proceedings of the 10th European Conference on Computer Vision. Marseille, France, 2008: 802-815.
[26] RUBLEE E, RABAUD V, KONOLIGE K, et al. ORB: an efficient alternative to SIFT or SURF[C]//Proceedings of 2011 International Conference on Computer Vision. Barcelona, Spain, 2011: 2564-2571.
[27] MUR-ARTAL R, TARDÓS J D. ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras[J]. IEEE transactions on robotics, 2017, 33(5): 1255-1262.
[28] FORSTER C, ZHANG Zichao, GASSNER M, et al. SVO: semidirect visual odometry for monocular and multicamera systems[J]. IEEE transactions on robotics, 2017, 33(2): 249-265.
[29] LOO S Y, AMIRI A J, MASHOHOR S, et al. CNN-SVO: improving the mapping in semi-direct visual odometry using single-image depth prediction[EB/OL]. (2018-10-01)[2020-02-03]. https://arxiv.org/abs/1810.01011.
[30] ZHANG Guofeng, LIU Haomin, DONG Zilong, et al. Efficient non-consecutive feature tracking for robust structure-from-motion[J]. IEEE transactions on image processing, 2016, 25(12): 5957-5970.
[31] ENGEL J, KOLTUN V, CREMERS D. Direct sparse odometry[J]. IEEE transactions on pattern analysis and machine intelligence, 2018, 40(3): 611-625.
[32] SCHLEGEL D, COLOSI M, GRISETTI G. ProSLAM: graph SLAM from a programmer’s perspective[EB/OL]. (2017-09-13)[2020-02-04]. https://arxiv.org/abs/1709.04377.
[33] SUMIKURA S, SHIBUYA M, SAKURADA K. OpenVSLAM: a versatile visual SLAM framework[C]//Proceedings of the 27th ACM International Conference on Multimedia. Nice, France, 2019.
[34] PFROMMER B, DANIILIDIS K. TagSLAM: robust SLAM with fiducial markers[EB/OL]. (2019-10-01)[2020-02-05]. https://arxiv.org/abs/1910.00679.
[35] MUÑOZ-SALINAS R, MEDINA-CARNICER R. UcoSLAM: simultaneous localization and mapping by fusion of keypoints and squared planar markers[J]. Pattern recognition, 2020, 101: 107193.
[36] ENGEL J, SCHÖPS T, CREMERS D. LSD-SLAM: large-scale direct monocular SLAM[C]//Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland, 2014.
[37] ENGEL J, STÜCKLER J, CREMERS D. Large-scale direct SLAM with stereo cameras[C]//Proceedings of 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems. Hamburg, Germany, 2015: 1935-1942.
[38] CARUSO D, ENGEL J, CREMERS D. Large-scale direct SLAM for omnidirectional cameras[C]//Proceedings of 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems. Hamburg, Germany, 2015: 141-148.
[39] WEIKERSDORFER D, HOFFMANN R, CONRADT J. Simultaneous localization and mapping for event-based vision systems[C]//Proceedings of the 9th International Conference on Computer Vision Systems. St. Petersburg, Russia, 2013: 133-142.
[40] WEIKERSDORFER D, ADRIAN D B, CREMERS D, et al. Event-based 3D SLAM with a depth-augmented dynamic vision sensor[C]//Proceedings of 2014 IEEE International Conference on Robotics and Automation. Hong Kong, China, 2014: 359-364.
[41] REBECQ H, HORSTSCHAEFER T, GALLEGO G, et al. EVO: a geometric approach to event-based 6-DOF parallel tracking and mapping in real time[J]. IEEE robotics and automation letters, 2017, 2(2): 593-600.
[42] ZHOU Yi, GALLEGO G, REBECQ H, et al. Semi-dense 3D reconstruction with a stereo event camera[C]//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany, 2018: 242-258.
[43] NEWCOMBE R A, LOVEGROVE S J, DAVISON A J. DTAM: dense tracking and mapping in real-time[C]//Proceedings of 2011 International Conference on Computer Vision. Barcelona, Spain, 2011: 2320-2327.
[44] NEWCOMBE R A, IZADI S, HILLIGES O, et al. KinectFusion: real-time dense surface mapping and tracking[C]//Proceedings of the 2011 10th IEEE International Symposium on Mixed and Augmented Reality. Basel, Switzerland, 2011: 127-136.
[45] IZADI S, KIM D, HILLIGES O, et al. KinectFusion: real-time 3D reconstruction and interaction using a moving depth camera[C]//Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology. Santa Barbara, USA, 2011: 559-568.
[46] WHELAN T, KAESS M, FALLON M, et al. Kintinuous: spatially extended KinectFusion[C]//Proceedings of RSS Workshop on RGB-D: Advanced Reasoning with Depth Cameras. Sydney, Australia, 2012.
[47] WHELAN T, JOHANNSSON H, KAESS M, et al. Robust real-time visual odometry for dense RGB-D mapping[C]//Proceedings of 2013 IEEE International Conference on Robotics and Automation. Karlsruhe, Germany, 2013: 5724-5731.
[48] WHELAN T, KAESS M, JOHANNSSON H, et al. Real-time large-scale dense RGB-D SLAM with volumetric fusion[J]. The international journal of robotics research, 2015, 34(4/5): 598-626.
[49] LABBÉ M, MICHAUD F. Memory management for real-time appearance-based loop closure detection[C]//Proceedings of 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems. San Francisco, USA, 2011: 1271-1276.
[50] LABBÉ M, MICHAUD F. Appearance-based loop closure detection for online large-scale and long-term operation[J]. IEEE transactions on robotics, 2013, 29(3): 734-745.
[51] LABBÉ M, MICHAUD F. Online global loop closure detection for large-scale multi-session graph-based SLAM[C]//Proceedings of 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems. Chicago, USA, 2014: 2661-2666.
[52] LABBÉ M, MICHAUD F. RTAB-Map as an open-source lidar and visual simultaneous localization and mapping library for large-scale and long-term online operation[J]. Journal of field robotics, 2019, 36(2): 416-446.
[53] KERL C, STURM J, CREMERS D. Dense visual SLAM for RGB-D cameras[C]//Proceedings of 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems. Tokyo, Japan, 2013: 2100-2106.
[54] KERL C, STURM J, CREMERS D. Robust odometry estimation for RGB-D cameras[C]//Proceedings of 2013 IEEE International Conference on Robotics and Automation. Karlsruhe, Germany, 2013: 3748-3754.
[55] NEWCOMBE R A, FOX D, SEITZ S M. DynamicFusion: reconstruction and tracking of non-rigid scenes in real-time[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA, 2015: 343-352.
[56] INNMANN M, ZOLLHÖFER M, NIEßNER M, et al. VolumeDeform: real-time volumetric non-rigid reconstruction[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands, 2016: 362-379.
[57] DOU Mingsong, KHAMIS S, DEGTYAREV Y, et al. Fusion4D: real-time performance capture of challenging scenes[J]. ACM transactions on graphics, 2016, 35(4): 114.
[58] WHELAN T, LEUTENEGGER S, SALAS-MORENO R F, et al. ElasticFusion: dense SLAM without a pose graph[C]//Proceedings of Robotics: Science and Systems. Rome, Italy, 2015.
[59] WHELAN T, SALAS-MORENO R F, GLOCKER B, et al. ElasticFusion: real-time dense SLAM and light source estimation[J]. The international journal of robotics research, 2016, 35(14): 1697-1716.
[60] KÄHLER O, PRISACARIU V A, MURRAY D W. Real-time large-scale dense 3D reconstruction with loop closure[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands, 2016: 500-516.
[61] PRISACARIU V A, KÄHLER O, GOLODETZ S, et al. InfiniTAM v3: a framework for large-scale 3D reconstruction with loop closure[EB/OL]. (2017-08-02)[2020-02-25]. http://arxiv.org/abs/1708.00783.
[62] ENDRES F, HESS J, STURM J, et al. 3-D mapping with an RGB-D camera[J]. IEEE transactions on robotics, 2014, 30(1): 177-187.
[63] GREENE W N, OK K, LOMMEL P, et al. Multi-level mapping: real-time dense monocular SLAM[C]//Proceedings of 2016 IEEE International Conference on Robotics and Automation. Stockholm, Sweden, 2016: 833-840.
[64] SMITH R C, CHEESEMAN P. On the representation and estimation of spatial uncertainty[J]. The international journal of robotics research, 1986, 5(4): 56-68.
[65] SUALEH M, KIM G W. Simultaneous localization and mapping in the epoch of semantics: a survey[J]. International journal of control, automation and systems, 2019, 17(3): 729-742.
[66] GOMEZ-OJEDA R, MORENO F A, ZUÑIGA-NOËL D, et al. PL-SLAM: a stereo SLAM system through the combination of points and line segments[J]. IEEE transactions on robotics, 2019, 35(3): 734-746.
[67] ZHOU Huizhong, ZOU Danping, PEI Ling, et al. StructSLAM: visual SLAM with building structure lines[J]. IEEE transactions on vehicular technology, 2015, 64(4): 1364-1375.
[68] ATANASOV N, BOWMAN S L, DANIILIDIS K, et al. A unifying view of geometry, semantics, and data association in SLAM[C]// Proceedings of the 27th International Joint Conference on Artificial Intelligence. Stockholm, Sweden, 2018: 5204-5208.

Similar articles:

[1] QUAN Meixiang, PIAO Songhao, LI Guo. An overview of visual SLAM[J]. CAAI Transactions on Intelligent Systems, 2016, 11(6): 768. [doi:10.11992/tis.201607026]

Memo:
Received: 2020-04-23.
Foundation item: Equipment Pre-research Project (41417070401).
About the authors: WANG Xia, associate professor and doctoral supervisor, deputy director of the Institute of Photoelectronic Imaging and Information Engineering. Her main research interests are photoelectronic imaging technology and photoelectric detection technology. She has led multiple provincial/ministerial-level projects and industry collaboration projects, holds more than 10 authorized national and defense invention patents, and her results have won one provincial second prize for technological invention, three third prizes for scientific and technological progress, and one third prize for scientific and technological progress from China Electronics Technology Group Corporation. She has edited and published two textbooks and more than 70 academic papers. ZUO Yifan, Ph.D. candidate; his main research interests are visual SLAM and multi-sensor fusion navigation.
Corresponding author: ZUO Yifan. E-mail: zuoyifan_bit@outlook.com.
Last update: 2021-01-15