QUAN Meixiang, PIAO Songhao, LI Guo. An overview of visual SLAM[J]. CAAI Transactions on Intelligent Systems, 2016, 11(6): 768-776. [doi: 10.11992/tis.201607026]

An overview of visual SLAM (视觉SLAM综述)

CAAI Transactions on Intelligent Systems (智能系统学报) [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
Vol. 11
Issue:
No. 6, 2016
Pages:
768-776
Section:
Publication date:
2017-01-20

Article Info

Title:
An overview of visual SLAM
Author(s):
QUAN Meixiang (权美香)1, PIAO Songhao (朴松昊)1,2, LI Guo (李国)1
1. Multi-agent Robot Research Center, Harbin Institute of Technology, Harbin 150000, China;
2. Multi-agent Robot Research Center, Harbin Institute of Technology, Harbin 150000, China
Keywords:
visual simultaneous localization and mapping; monocular vision; RGB-D SLAM; feature detection and matching; loop closure detection
CLC number:
TP391
DOI:
10.11992/tis.201607026
Abstract:
Visual SLAM refers to using a camera as the only external sensor to localize the camera itself while simultaneously building a map of the environment. The quality of the map created by SLAM plays a decisive role in the performance of subsequent autonomous localization, path planning, and obstacle avoidance. This paper reviews feature-based and direct visual SLAM methods, the main landmark achievements in visual SLAM, and the major SLAM research laboratories; it then describes SIFT, SURF, and ORB feature detection and matching as well as keyframe selection, and summarizes the loop closure detection and map optimization methods used to eliminate accumulated error. Finally, the main development trends and research hotspots of visual SLAM are discussed, and the advantages and disadvantages of monocular, stereo (binocular), and RGB-D SLAM are analyzed.
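The ORB feature detection and matching step named in the abstract (reference [32]) is the front end of feature-based systems such as ORB-SLAM [13]. The following short Python sketch is illustrative only and not taken from the paper; it shows one common way to detect ORB keypoints in two frames with OpenCV and match their binary descriptors using the Hamming distance. The image file names are placeholders.

import cv2

# Load two consecutive frames as grayscale images (placeholder file names).
img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

# ORB combines a FAST keypoint detector with a rotation-aware BRIEF binary descriptor.
orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Binary descriptors are compared with the Hamming distance;
# cross-checking keeps only mutually best matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

print(len(matches), "putative matches; best distance:", matches[0].distance)

In a full SLAM front end such matches would typically be filtered further (for example with RANSAC on the essential or fundamental matrix) before being used for pose estimation and mapping.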

References:

[1] DAVISON A J. SLAM with a single camera[C]//Proceedings of Workshop on Concurrent Mapping and Localization for Autonomous Mobile Robots in Conjunction with ICRA. Washington, DC, USA, 2002:18-27.
[2] DAVISON A J. Real-time simultaneous localisation and mapping with a single camera[C]//Proceedings of the Ninth IEEE International Conference on Computer Vision. Washington, DC, USA, 2003:1403-1410.
[3] DAVISON A J, REID I D, MOLTON N D, et al. MonoSLAM:real-time single camera SLAM[J]. IEEE transactions on pattern analysis and machine intelligence, 2007, 29(6):1052-1067.
[4] CIVERA J, DAVISON A J, MONTIEL J M M. Inverse depth parametrization for monocular SLAM[J]. IEEE transactions on robotics, 2008, 24(5):932-945.
[5] MARTINEZ-CANTIN R, CASTELLANOS J A. Unscented SLAM for large-scale outdoor environments[C]//Proceedings of 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems. Edmonton, Alberta, Canada, 2005:3427-3432.
[6] CHEKHLOV D, PUPILLI M, MAYOL-CUEVAS W, et al. Real-time and robust monocular SLAM using predictive multi-resolution descriptors[C]//Proceedings of the Second International Conference on Advances in Visual Computing. Lake Tahoe, USA, 2006:276-285.
[7] HOLMES S, KLEIN G, MURRAY D W. A square root unscented Kalman filter for visual monoSLAM[C]//Proceedings of 2008 International Conference on Robotics and Automation, ICRA. Pasadena, California, USA, 2008:3710-3716.
[8] SIM R, ELINAS P, GRIFFIN M, et al. Vision-based SLAM using the Rao-Blackwellised particle filter[J]. IJCAI workshop on reasoning with uncertainty in robotics, 2005, 9(4):500-509.
[9] LI Maohai, HONG Bingrong, CAI Zesu, et al. Novel Rao-Blackwellized particle filter for mobile robot SLAM using monocular vision[J]. International journal of intelligent technology, 2006, 1(1):63-69.
[10] KLEIN G, MURRAY D. Parallel Tracking and Mapping for Small AR Workspaces[C]//IEEE and ACM International Symposium on Mixed and Augmented Reality. Nara, Japan, 2007:225-234.
[11] KLEIN G, MURRAY D. Improving the agility of keyframe-based SLAM[C]//European Conference on Computer Vision. Marseille, France, 2008:802-815.
[12] MUR-ARTAL R, TARDÓS J D. Fast relocalisation and loop closing in keyframe-based SLAM[C]//IEEE International Conference on Robotics and Automation. Hong Kong, China, 2014:846-853.
[13] MUR-ARTAL R, MONTIEL J M M, TARDOS J D. ORB-SLAM:A Versatile and Accurate Monocular SLAM System[J]. IEEE transactions on robotics, 2015, 31(5):1147-1163.
[14] KHOSHELHAM K, ELBERINK S O. Accuracy and resolution of Kinect depth data for indoor mapping applications[J]. Sensors, 2012, 12(2):1437-1454.
[15] HOGMAN V. Building a 3-D Map from RGB-D sensors[D]. Stockholm, Sweden:Royal Institute of Technology, 2012.
[16] HENRY P, KRAININ M, HERBST E, et al. RGB-D mapping:Using depth cameras for dense 3-D modeling of indoor environments[C]//12th International Symposium on Experimental Robotics. Berlin, Germany, 2014:477-491.
[17] DRYANOVSKI I, VALENTI R G, XIAO J Z. Fast visual odometry and mapping from RGB-D data[C]//IEEE International Conference on Robotics and Automation. Piscataway, USA, 2013:2305-2310.
[18] HENRY P, KRAININ M, HERBST E, et al. RGB-D mapping:using depth cameras for dense 3-D modeling of indoor environments[M]//KHATIB O, KUMAR V, PAPPAS G J. Experimental Robotics. Berlin Heidelberg:Springer, 2014:647-663.
[19] HENRY P, KRAININ M, HERBST E, et al. RGB-D Mapping:Using Depth Cameras for Dense 3-D Modeling of Indoor Environments[C]//12th International Symposium on Experimental Robotics. Berlin Germany, 2014:477-491.
[20] HENRY P, KRAININ M, HERBST E, et al. RGB-D mapping:Using Kinect-style depth cameras for dense 3-D modeling of indoor environments[J]. International journal of robotics research, 2012, 31(5):647-663.
[21] ENGELHARD N, ENDRES F, HESS J. Real-time 3-D visual SLAM with a hand-held RGB-D camera[C]//Proceedings of the RGB-D workshop on 3-D Perception in Robotics at the European Robotics Forum. Västerås, Sweden, 2011.
[22] STÜHMER J, GUMHOLD S, CREMERS D. Real-time dense geometry from a handheld camera[C]//GOESELE M, ROTH S, KUIJPER A, et al. Pattern Recognition. Berlin Heidelberg:Springer, 2010:11-20.
[23] ENGEL J, STURM J, CREMERS D. Semi-Dense Visual Odometry for a Monocular Camera[C]//International Conference on Computer Vision. Sydney, NSW, 2013:1449-1456.
[24] NEWCOMBE R A, LOVEGROVE S J, DAVISON A J. DTAM:Dense tracking and mapping in real-time[C]//International Conference on Computer Vision. Barcelona, Spain, 2011:2320-2327.
[25] FORSTER C, PIZZOLI M, SCARAMUZZA D. SVO:Fast semi-direct monocular visual odometry[C]//2014 IEEE International Conference on Robotics and Automation. Hong Kong, China, 2014:15-22.
[26] ENGEL J, SCHÖPS T, CREMERS D. LSD-SLAM:Large-Scale Direct Monocular SLAM[M]//FLEET D, PAJDLA T, SCHIELE B, et al, eds. Computer Vision-ECCV 2014. Switzerland:Springer International Publishing, 2014:834-849.
[27] NEWCOMBE R A, IZADI S, HILLIGES O, et al. KinectFusion:Real-time dense surface mapping and tracking[C]//IEEE International Symposium on Mixed and Augmented Reality. Basel, Switzerland, 2011:127-136.
[28] GOKHOOL T, MEILLAND M, RIVES P, et al. A dense map building approach from spherical RGBD images[C]//International Conference on Computer Vision Theory and Applications. Lisbon, Portugal, 2014:1103-1114.
[29] KERL C, STURM J, CREMERS D. Dense visual SLAM for RGB-D cameras[C]//Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems. Tokyo, Japan, 2013:2100-2106.
[30] LOWE D G. Distinctive image features from scale-invariant keypoints[J]. International journal of computer vision, 2004, 60(2):91-110.
[31] BAY H, TUYTELAARS T, VAN GOOL L. SURF:speeded up robust features[M]//LEONARDIS A, BISCHOF H, PINZ A. Computer Vision-ECCV 2006. Berlin Heidelberg:Springer, 2006.
[32] RUBLEE E, RABAUD V, KONOLIGE K, et al. ORB:An efficient alternative to SIFT or SURF[C]//International Conference on Computer Vision. Barcelona, Spain, 2011:2564-2571.
[33] ALI A M, JAN NORDIN M. SIFT based monocular SLAM with multi-clouds features for indoor navigation[C]//2010 IEEE Region 10 Conference TENCON. Fukuoka, Japan, 2010:2326-2331.
[34] WU E Y, ZHAO L K, GUO Y P, et al. Monocular vision SLAM based on key feature points selection[C]//2010 IEEE International Conference on Information and Automation (ICIA). Harbin, China, 2010:1741-1745.
[35] CHEN C H, CHAN Y P. SIFT-based monocular SLAM with inverse depth parameterization for robot localization[C]//IEEE Workshop on Advanced Robotics and Its Social Impacts, 2007. Hsinchu, China, 2007:1-6.
[36] ZHU D X. Binocular Vision-SLAM Using Improved SIFT Algorithm[C]//2010 2nd International Workshop on Intelligent Systems and Applications (ISA). Wuhan, China, 2010:1-4.
[37] ZHANG Z Y, HUANG Y L, LI C, et al. Monocular vision simultaneous localization and mapping using SURF[C]//WCICA 2008. 7th World Congress on Intelligent Control and Automation. Chongqing, China, 2008:1651-1656.
[38] YE Y. The research of SLAM monocular vision based on the improved SURF feature[C]//International Conference on Computational Intelligence and Communication Networks. Hong Kong, China, 2014:344-348.
[39] WANG Y T, FENG Y C. Data association and map management for robot SLAM using local invariant features[C]//2013 IEEE International Conference on Mechatronics and Automation. Takamatsu, Japan, 2013.
[40] ROSTEN E, DRUMMOND T. Machine Learning for High-Speed Corner Detection[M]//LEONARDIS A, BISCHOF H, PINZ A, et al. European Conference on Computer Vision. Berlin Heidelberg:Springer, 2006:430-443.
[41] CALONDER M, LEPETIT V, STRECHA C, et al. BRIEF:Binary Robust Independent Elementary Features[C]//European Conference on Computer Vision. Crete, Greece, 2010:778-792.
[42] FEN X, ZHEN W. An embedded visual SLAM algorithm based on Kinect and ORB features[C]//2015 34th Chinese Control Conference. Hangzhou, China, 2015:6026-6031.
[43] XIN G X, ZHANG X T, WANG X, et al. A RGBD SLAM algorithm combining ORB with PROSAC for indoor mobile robot[C]//2015 4th International Conference on Computer Science and Network Technology (ICCSNT). Harbin, China, 2015:71-74.
[44] LI J, PAN T S, TSENG K K, et al. Design of a monocular simultaneous localisation and mapping system with ORB feature[C]//International Conference on Multimedia and Expo (ICME), San Jose, California, USA, 2013:1-4.
[45] EADE E, DRUMMOND T. Edge landmarks in monocular SLAM[J]. Image and vision computing, 2009, 27(5):588-596.
[46] KLEIN G, MURRAY D. Improving the agility of keyframe-based SLAM[C]//European Conference on Computer Vision. Marseille, France, 2008:802-815.
[47] CONCHA A, CIVERA J. Using superpixels in monocular SLAM[C]//IEEE International Conference on Robotics and Automation. Hong Kong, China, 2014:365-372.
[48] SCHINDLER G, BROWN M, SZELISKI R. City-Scale Location Recognition[C]//IEEE Conference on Computer Vision and Pattern Recognition. Minneapolis, MN, USA, 2007:1-7.
[49] ULRICH I, NOURBAKHSH I. Appearance-based place recognition for topological localization[C]//IEEE International Conference on Robotics and Automation. San Francisco, CA, USA, 2000:1023-1029.
[50] NEIRA J, RIBEIRO M I, TARDOS J D. Mobile robot localization and map building using monocular vision[C]//Proceedings of the 5th International Symposium on Intelligent Robotic Systems. Pisa, Italy, 1997:275-284.
[51] WILLIAMS B, CUMMINS M, NEIRA J, et al. A comparison of loop closing techniques in monocular SLAM[J]. Robotics and autonomous systems, 2009, 57(12):1188-1197.
[52] MUR-ARTAL R, TARDOS J D. ORB-SLAM:Tracking and mapping recognizable features[C]//IEEE International Conference on Robotics and Automation (ICRA). Berkeley, CA, USA, 2014.
[53] CUMMINS M, NEWMAN P. Accelerated appearance-only SLAM[C]//IEEE International Conference on Robotics and Automation. Pasadena, California, USA, 2008:1828-1833.
[54] CLEMENTE L A, DAVISON A J, REID I D, et al. Mapping Large Loops with a Single Hand-Held Camera[C]//Robotics:Science and Systems. Atlanta, GA, USA, 2007.
[55] CUMMINS M, NEWMAN P. FAB-MAP:Probabilistic Localization and Mapping in the Space of Appearance[J]. International Journal of Robotics Research, 2008, 27(6):647-665.
[56] NISTER D, STEWENIUS H. Scalable Recognition with a Vocabulary Tree[C]//2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06). New York, NY, USA, 2006:2161-2168.
[57] ANGELI A, FILLIAT D, DONCIEUX S, et al. A fast and incremental method for loop-closure detection using bags of visual words[J]. IEEE transactions on robotics, 2008, 24(5):1027-1037.
[58] CUMMINS M, NEWMAN P. Highly scalable appearance-only SLAM-FAB-MAP 2.0[C]//Robotics:Science and Systems V, University of Washington. Seattle, USA, 2009.
[59] GALVEZ-LÓPEZ D, TARDOS J D. Bags of binary words for fast place recognition in image sequences[J]. IEEE Transactions on robotics, 2012, 28(5):1188-1197.
[60] EADE E D, DRUMMOND T W. Unified loop closing and recovery for real time monocular SLAM[C]//British Machine Vision Conference. Leeds, UK, 2008:1-10.
[61] GÁLVEZ-LÓPEZ D, TARDOS J D. Real-time loop detection with bags of binary words[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems. San Francisco, California, USA, 2011:51-58.
[62] CUMMINS M, NEWMAN P. Appearance-only SLAM at large scale with FAB-MAP 2.0[J]. International journal of robotics research, 2011, 30(9):1100-1123.
[63] LOWE D G. Distinctive image features from scale-invariant keypoints[J]. International journal of computer vision, 2004, 60(2):91-110.
[64] TRIGGS B, MCLAUCHLAN P F, HARTLEY R I, et al. Bundle Adjustment-A Modern Synthesis[M]//TRIGGS B, ZISSERMAN A, SZELISKI R. Vision Algorithms:Theory and Practice. Berlin Heidelberg:Springer, 2000:298-372.
[65] HARTLEY R, ZISSERMAN A. Multiple view geometry in computer vision[M]. 2nd ed. Cambridge, UK: Cambridge University Press, 2003.
[66] KÜMMERLE R, GRISETTI G, STRASDAT H. g2o:A general framework for graph optimization[C]//IEEE International Conference on Robotics and Automation. Shanghai, China, 2011:3607-3613.
[67] STRASDAT H, MONTIEL J M M, DAVISON A J. Scale drift-aware large scale monocular SLAM[C]//Proceedings of Robotics:Science and Systems. Zaragoza, Spain, 2010.
[68] STRASDAT H, DAVISON A J, MONTIEL J M M, et al. Double window optimisation for constant time visual SLAM[C]//International Conference on Computer Vision. Barcelona, Spain, 2011:2352-2359.
[69] MOURIKIS A I, ROUMELIOTIS S I. A multi-state constraint Kalman filter for vision-aided inertial navigation[C]//Proceedings of the 2007 IEEE International Conference on Robotics and Automation (ICRA). Roma, Italy, 2007.
[70] MOURIKIS A I, ROUMELIOTIS S I. A dual-layer estimator architecture for long-term localization[C]//Proceedings of the 2008 Workshop on Visual Localization for Mobile Platforms at CVPR. Anchorage, Alaska, 2008.
[71] LEUTENEGGER S, FURGALE P, RABAUD V, et al. Keyframe-based visual-inertial slam using nonlinear optimization[C]//Proceedings of 2013 Robotics:Science and Systems (RSS). Berlin, Germany, 2013.
[72] Google. Project Tango. https://www.google.com/atap/projecttango/.
[73] ŽBONTAR J, LECUN Y. Stereo matching by training a convolutional neural network to compare image patches[J]. The journal of machine learning research, 2015, 17(1):2287-2318.
[74] SÜNDERHAUF N, SHIRAZI S, DAYOUB F. On the performance of ConvNet features for place recognition[C]//2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Hamburg, Germany, 2015:4297-4304.
[75] COSTANTE G, MANCINI M, VALIGI P, et al. Exploring representation learning with CNNs for frame-to-frame ego-motion estimation[J]. IEEE robotics and automation letters, 2016, 1(1):18-25.
[76] KENDALL A, CIPOLLA R. Modelling uncertainty in deep learning for camera relocalization[C]//2016 IEEE International Conference on Robotics and Automation. Stockholm, Sweden, 2016:4762-4769.

Similar Articles:

[1] GU Zhaopeng, LIU Hong. A survey of monocular simultaneous localization and mapping[J]. CAAI Transactions on Intelligent Systems, 2015, 10(4): 499. [doi:10.3969/j.issn.1673-4785.201503003]
[2] PU Xingcheng, TAN Shaofeng, ZHANG Yi. Research on the navigation of mobile robots based on the improved FAST algorithm[J]. CAAI Transactions on Intelligent Systems, 2014, 9(4): 419. [doi:10.3969/j.issn.1673-4785.201305076]

Memo:
Received date: 2016-07-25.
Foundation item: General Program of the National Natural Science Foundation of China (61375081).
About the authors: QUAN Meixiang, female, born in 1992, Ph.D. Her main research interests are monocular visual SLAM, visual-inertial navigation (VIN), and visual navigation of mobile robots. PIAO Songhao, male, born in 1972, professor and doctoral supervisor, executive council member of the Chinese Association for Artificial Intelligence (CAAI) and director of its Robot Culture and Art Professional Committee. His main research interests are robot environment perception and navigation, robot motion planning, and multi-agent robot cooperation. He has led or participated in many projects, including National Natural Science Foundation of China projects, key and general projects of the National 863 Program, the State Key Laboratory of Robotics and System fund, the Ministry of Education 985 Project, and a Samsung international cooperation project. He has published more than 60 academic papers, over 60 of which are indexed by SCI, EI, and ISTP, as well as one monograph. LI Guo, male, born in 1989, Ph.D. His main research interests are SLAM and machine learning.
Corresponding author: PIAO Songhao. E-mail: piaosh@hit.edu.cn.