[1]张含笑,邢向磊.融合深度学习与神经隐式表征的视觉SLAM系统[J].智能系统学报,2026,21(1):120-131.[doi:10.11992/tis.202505029]
 ZHANG Hanxiao,XING Xianglei.Deep-learning-enhanced visual SLAM with neural implicit scene representation[J].CAAI Transactions on Intelligent Systems,2026,21(1):120-131.[doi:10.11992/tis.202505029]

Deep-learning-enhanced visual SLAM with neural implicit scene representation

参考文献/References:
[1] 黄泽霞, 邵春莉. 深度学习下的视觉SLAM综述[J]. 机器人, 2023, 45(6): 756-768 HUANG Zexia, SHAO Chunli. A survey of visual SLAM under deep learning[J]. Robot, 2023, 45(6): 756-768
[2] DETONE D, MALISIEWICZ T, RABINOVICH A. SuperPoint: self-supervised interest point detection and description[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Salt Lake City: IEEE, 2018.
[3] LUO Zixin, SHEN Tianwei, ZHOU Lei, et al. GeoDesc: learning local descriptors by integrating geometry constraints[C]//European Conference on Computer Vision. Munich: ECVA, 2018.
[4] SARLIN P E, DETONE D, MALISIEWICZ T, et al. SuperGlue: learning feature matching with graph neural networks[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020.
[5] RANFTL R, KOLTUN V. Deep fundamental matrix estimation[C]//European Conference on Computer Vision. Munich: ECVA, 2018.
[6] VON STUMBERG L, WENZEL P, YANG Nan, et al. LM-Reloc: Levenberg-Marquardt based direct visual relocalization[C]//2020 International Conference on 3D Vision. Fukuoka: IEEE, 2020.
[7] SARLIN P E, UNAGAR A, LARSSON M, et al. Back to the feature: learning robust camera localization from pixels to pose[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021.
[8] MCCORMAC J, HANDA A, DAVISON A, et al. SemanticFusion: dense 3D semantic mapping with convolutional neural networks[C]//2017 IEEE International Conference on Robotics and Automation. Singapore: IEEE, 2017.
[9] YU Chao, LIU Zuxin, LIU Xinjun, et al. DS-SLAM: a semantic visual SLAM towards dynamic environments[C]//2018 IEEE/RSJ International Conference on Intelligent Robots and Systems. Madrid: IEEE, 2018.
[10] TATENO K, TOMBARI F, LAINA I, et al. CNN-SLAM: real-time dense monocular SLAM with learned depth prediction[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017.
[11] ZHOU Huizhong, UMMENHOFER B, BROX T. DeepTAM: deep tracking and mapping[C]//European Conference on Computer Vision. Munich: ECVA, 2018.
[12] BLOESCH M, CZARNOWSKI J, CLARK R, et al. CodeSLAM: learning a compact, optimisable representation for dense visual SLAM[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018.
[13] CZARNOWSKI J, LAIDLOW T, CLARK R, et al. DeepFactors: real-time probabilistic dense monocular SLAM[J]. IEEE robotics and automation letters, 2020, 5(2): 721-728
[14] TEED Z, DENG Jia. DROID-SLAM: deep visual SLAM for monocular, stereo, and RGB-D cameras[C]//Proceedings of the 35th Conference on Neural Information Processing Systems. Online: NeurIPS, 2021.
[15] TEED Z, DENG Jia. RAFT: recurrent all-pairs field transforms for optical flow[C]//European Conference on Computer Vision. Online: ECVA, 2020.
[16] MILDENHALL B, SRINIVASAN P P, TANCIK M, et al. NeRF: representing scenes as neural radiance fields for view synthesis[C]//European Conference on Computer Vision. Online: ECVA, 2020.
[17] SUCAR E, LIU Shikun, ORTIZ J, et al. iMAP: implicit mapping and positioning in real-time[C]//2021 IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021.
[18] KONG Xin, LIU Shikun, TAHER M, et al. vMAP: vectorised object mapping for neural field SLAM[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023.
[19] ZHU Zihan, PENG Songyou, LARSSON V, et al. NICE-SLAM: neural implicit scalable encoding for SLAM[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022.
[20] WANG Hengyi, WANG Jingwen, AGAPITO L. Co-SLAM: joint coordinate and sparse parametric encodings for neural real-time SLAM[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023.
[21] CHO K, VAN MERRIENBOER B, GULCEHRE C, et al. Learning phrase representations using RNN encoder–decoder for statistical machine translation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Doha: ACL, 2014.
[22] MÜLLER T, EVANS A, SCHIED C, et al. Instant neural graphics primitives with a multiresolution hash encoding[J]. ACM transactions on graphics, 2022, 41(4): 1-15
[23] RADFORD A, KIM W J, HALLACY C, et al. Learning transferable visual models from natural language supervision[EB/OL]. (2021-02-26)[2025-04-20]. https://arxiv.org/abs/2103.00020.
[24] BURRI M, NIKOLIC J, GOHL P, et al. The EuRoC micro aerial vehicle datasets[J]. The international journal of robotics research, 2016, 35(10): 1157-1163
[25] STRAUB J, WHELAN T, MA L N, et al. The replica dataset: a digital replica of indoor space[EB/OL]. (2019-06-13)[2025-04-20]. https://arxiv.org/abs/1906.05797.
[26] FORSTER C, PIZZOLI M, SCARAMUZZA D. SVO: fast semi-direct monocular visual odometry[C]//2014 IEEE International Conference on Robotics and Automation. Hong Kong: IEEE, 2014.
[27] MUR-ARTAL R, TARDÓS J D. ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras[J]. IEEE transactions on robotics, 2017, 33(5): 1255-1262
[28] CAMPOS C, ELVIRA R, RODRÍGUEZ J J G, et al. ORB-SLAM3: an accurate open-source library for visual, visual–inertial, and multimap SLAM[J]. IEEE transactions on robotics, 2021, 37(6): 1874-1890
[29] SCHÖNBERGER J L, FRAHM J M. Structure-from-motion revisited[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016.
[30] ZHU Zihan, PENG Songyou, LARSSON V, et al. NICER-SLAM: neural implicit scene encoding for RGB SLAM[C]//2024 International Conference on 3D Vision. Davos: IEEE, 2024.
[31] YANG Xingrui, LI Hai, ZHAI Hongjia, et al. Vox-fusion: dense tracking and mapping with voxel-based neural implicit representation[C]//2022 IEEE International Symposium on Mixed and Augmented Reality. Singapore: IEEE, 2022.
相似文献/Similar Articles:
[1]杨慧,张婷,金晟,等.基于二进制生成对抗网络的视觉回环检测研究[J].智能系统学报,2021,16(4):673.[doi:10.11992/tis.202007007]
 YANG Hui,ZHANG Ting,JIN Sheng,et al.Visual loop closure detection based on binary generative adversarial network[J].CAAI Transactions on Intelligent Systems,2021,16(4):673.[doi:10.11992/tis.202007007]
[2]朱少凯,孟庆浩,金晟,等.基于深度强化学习的室内视觉局部路径规划[J].智能系统学报,2022,17(5):908.[doi:10.11992/tis.202107059]
 ZHU Shaokai,MENG Qinghao,JIN Sheng,et al.Indoor visual local path planning based on deep reinforcement learning[J].CAAI Transactions on Intelligent Systems,2022,17(5):908.[doi:10.11992/tis.202107059]
[3]殷泽众,郭茂祖,田乐.基于傅里叶频域截断的神经辐射场优化[J].智能系统学报,2024,19(5):1319.[doi:10.11992/tis.202401036]
 YIN Zezhong,GUO Maozu,TIAN Le.Neural radiance field optimization based on Fourier frequency domain truncation[J].CAAI Transactions on Intelligent Systems,2024,19(5):1319.[doi:10.11992/tis.202401036]

备注/Memo

Received: 2025-05-28.
Funding: National Natural Science Foundation of China (62076078, 61703119); Fundamental Research Funds for the Central Universities (3072024LJ0403).
About the authors: ZHANG Hanxiao, master's student; her research focuses on computer vision. E-mail: 2682706067@qq.com. XING Xianglei, professor and doctoral supervisor; his research focuses on pattern recognition and computer vision. He received the First Prize of the Heilongjiang Provincial University Science and Technology Award (Natural Science) and the Excellent Paper Award of CAAI Transactions on Intelligent Systems, and has published more than 60 academic papers. E-mail: xingxl@hrbeu.edu.cn.
Corresponding author: XING Xianglei. E-mail: xingxl@hrbeu.edu.cn

更新日期/Last Update: 2026-01-05
Copyright © Editorial Office of CAAI Transactions on Intelligent Systems (《智能系统学报》)
Address: Building 145-1, Nantong Street, Nangang District, Harbin 150001, Heilongjiang, China. Tel: 0451-82534001, 82518134. E-mail: tis@vip.sina.com