ZHU Shaokai, MENG Qinghao, JIN Sheng, et al. Indoor visual local path planning based on deep reinforcement learning[J]. CAAI Transactions on Intelligent Systems, 2022, 17(5): 908-918. DOI: 10.11992/tis.202107059.
CAAI Transactions on Intelligent Systems (《智能系统学报》) [ISSN 1673-4785 / CN 23-1538/TP]
Volume: 17
Issue: 2022, No. 5
Pages: 908-918
Section: Academic Papers - Machine Learning
Publication date: 2022-09-05
- Title: Indoor visual local path planning based on deep reinforcement learning
- Author(s): ZHU Shaokai, MENG Qinghao, JIN Sheng, DAI Xuyang
Institute of Robotics and Autonomous Systems, School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
- Keywords: visual navigation; deep learning; reinforcement learning; local path planning; obstacle avoidance; visual SLAM; proximal policy optimization (PPO); mobile robot
- CLC number: TP391
- DOI: 10.11992/tis.202107059
- Online publication date: 2022-06-24
- Abstract:
Traditional robot local path planning methods are mostly designed for settings in which a prior map is available, so they perform poorly when combined with visual simultaneous localization and mapping (SLAM) for navigation. This paper therefore proposes a visual local path planning strategy based on deep reinforcement learning. First, a grid map of the surrounding environment is built with visual SLAM, and a global path is planned on it with the A* algorithm. Second, taking obstacle avoidance, travel efficiency, and pose tracking into account, a local path planning strategy is constructed with deep reinforcement learning: a discrete action space whose basic elements are moving forward, turning left, and turning right is designed, together with a state space built from visual observations such as color images, depth images, and feature-point maps, and the proximal policy optimization (PPO) algorithm is used to learn the best state-action mapping network. Results on the Habitat simulation platform show that the proposed strategy can plan an optimal or suboptimal path on a map created in real time. Compared with traditional local path planning algorithms, the average success rate increases by 53.9%, while the pose tracking loss rate and the collision rate decrease by 66.5% and 30.1%, respectively.
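The abstract describes a two-stage pipeline: A* global planning on a SLAM-built occupancy grid, followed by a local policy with a three-action discrete space driven by visual observations and trained with PPO. The sketch below is only an illustration of that combination under stated assumptions: NumPy and PyTorch are available, and the astar function, LocalPlannerPolicy network, layer sizes, and 84x84 three-channel observations are hypothetical choices rather than the authors' implementation.

# Minimal sketch, not the authors' code: A* global planning on an occupancy
# grid plus a PPO-style actor-critic with a discrete action space
# (forward / turn left / turn right) over stacked visual observations.
import heapq
from itertools import count

import numpy as np
import torch
import torch.nn as nn


def astar(grid, start, goal):
    """4-connected A* on a binary occupancy grid (0 = free, 1 = occupied)."""
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    tie = count()                                 # tie-breaker for the heap
    open_set = [(h(start), next(tie), 0, start, None)]
    came_from, g_cost = {}, {start: 0}
    while open_set:
        _, _, g, node, parent = heapq.heappop(open_set)
        if node in came_from:                     # already finalized
            continue
        came_from[node] = parent
        if node == goal:                          # walk parents back to the start
            path = [node]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (node[0] + dr, node[1] + dc)
            if not (0 <= nxt[0] < grid.shape[0] and 0 <= nxt[1] < grid.shape[1]):
                continue
            if grid[nxt] == 1 or g + 1 >= g_cost.get(nxt, float("inf")):
                continue
            g_cost[nxt] = g + 1
            heapq.heappush(open_set, (g + 1 + h(nxt), next(tie), g + 1, nxt, node))
    return None                                   # goal unreachable on the current map


class LocalPlannerPolicy(nn.Module):
    """Actor-critic head over a small CNN encoder; 3 discrete actions."""

    def __init__(self, in_channels=3, n_actions=3):
        super().__init__()
        self.encoder = nn.Sequential(             # layer sizes are assumptions
            nn.Conv2d(in_channels, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(512), nn.ReLU(),
        )
        self.actor = nn.Linear(512, n_actions)    # logits: forward, left, right
        self.critic = nn.Linear(512, 1)           # state value used by PPO

    def forward(self, obs):
        feat = self.encoder(obs)
        return torch.distributions.Categorical(logits=self.actor(feat)), self.critic(feat)


if __name__ == "__main__":
    # Global stage: coarse path on a toy occupancy grid with one wall.
    grid = np.zeros((20, 20), dtype=np.int8)
    grid[5:15, 10] = 1
    print("global path length:", len(astar(grid, (0, 0), (19, 19))))

    # Local stage: sample one discrete action from the visual policy.
    policy = LocalPlannerPolicy()
    obs = torch.rand(1, 3, 84, 84)                # stand-in for colour/depth/feature maps
    dist, value = policy(obs)
    print("sampled action:", ["forward", "left", "right"][dist.sample().item()],
          "value estimate:", value.item())

In a full PPO training loop, the Categorical distribution and the value head shown here would feed the clipped surrogate objective and the value loss; the snippet above only demonstrates a single planning call and a single policy forward pass.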
Memo
Received: 2021-07-27.
Foundation items: China Postdoctoral Science Foundation (2021M692390); Natural Science Foundation of Tianjin (20JCZDJC00150, 20JCYBJC00320).
About the authors: ZHU Shaokai, master's student; main research interests: visual simultaneous localization and mapping and vision-based robot navigation. MENG Qinghao, professor and doctoral supervisor; main research interests: robot perception, navigation, and control; has completed more than 10 research projects and published more than 100 academic papers. JIN Sheng, Ph.D. candidate; main research interests: vision-based robot navigation and deep reinforcement learning.
Corresponding author: JIN Sheng. E-mail: shengjin@tju.edu.cn