[1]张鹏鹏,魏长赟,张恺睿,等.旋翼无人机在移动平台降落的控制参数自学习调节方法[J].智能系统学报,2022,17(5):931-940.[doi:10.11992/tis.202107040]
ZHANG Pengpeng,WEI Changyun,ZHANG Kairui,et al.Self-learning approach to control parameter adjustment for quadcopter landing on a moving platform[J].CAAI Transactions on Intelligent Systems,2022,17(5):931-940.[doi:10.11992/tis.202107040]
《智能系统学报》(CAAI Transactions on Intelligent Systems) [ISSN 1673-4785 / CN 23-1538/TP]
Volume: 17
Issue: 2022, No. 5
Pages: 931-940
Section: Academic Papers (Machine Learning)
Publication date: 2022-09-05
Title: Self-learning approach to control parameter adjustment for quadcopter landing on a moving platform
Authors: ZHANG Pengpeng (张鹏鹏), WEI Changyun (魏长赟), ZHANG Kairui (张恺睿), OUYANG Yongping (欧阳勇平)
Affiliation: College of Mechanical and Electrical Engineering, Hohai University, Changzhou 213022, China
Keywords: autonomous landing; reinforcement learning; path planning; COACH framework; deterministic policy gradient; air-ground cooperation; UAV; optimal control
CLC number: TP273+.2
DOI: 10.11992/tis.202107040
Document code: 2022-05-20
Abstract: Unmanned aerial vehicles (UAVs) can operate over complex terrain, but limited battery capacity and other constraints prevent them from executing tasks for long periods. Cooperation between a UAV and other unmanned systems (unmanned ground vehicles, unmanned surface vehicles, etc.) can effectively extend the UAV's working time and ensure that scheduled tasks are completed; once the UAV has finished its task, landing it quickly and stably on a moving platform is both necessary and challenging. To address this landing problem, this paper proposes a deep reinforcement learning proportional-integral-derivative (PID) method based on the COACH (corrective advice communicated by humans) framework, which provides an optimal path for the UAV to land on a moving platform. First, the reinforcement learning model is trained with the corrective-feedback framework in a simulated environment; the trained model then outputs the control parameters in both simulated and real environments, and these parameters are finally used to compute the UAV's position control commands. Simulation results and real-flight experiments show that the proposed method outperforms traditional control methods and can reliably complete the landing task on a moving platform.
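The pipeline described above, in which a trained agent proposes PID parameters that a position controller then turns into commands for tracking the moving platform, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the trained agent is replaced by a hand-tuned placeholder gain schedule, and all names (GainPolicy, PIDController, track_platform) and numbers (gain ranges, time step, platform speed) are illustrative assumptions.

```python
import numpy as np


class GainPolicy:
    """Placeholder for the trained agent that maps the tracking error to PID gains."""

    def __init__(self, kp_range=(0.5, 2.0), ki_range=(0.0, 0.1), kd_range=(0.0, 0.3)):
        self.kp_range = kp_range
        self.ki_range = ki_range
        self.kd_range = kd_range

    def act(self, error):
        # In the paper the gains come from a learned policy (deterministic policy
        # gradient refined with COACH-style corrective advice); here they are just
        # interpolated from the error magnitude as a stand-in.
        scale = float(np.clip(np.linalg.norm(error), 0.0, 1.0))

        def _interp(lo, hi):
            return lo + scale * (hi - lo)

        return _interp(*self.kp_range), _interp(*self.ki_range), _interp(*self.kd_range)


class PIDController:
    """Planar PID controller that turns a position error into a velocity command."""

    def __init__(self, dt=0.05):
        self.dt = dt
        self.integral = np.zeros(2)
        self.prev_error = None

    def step(self, error, kp, ki, kd):
        self.integral += error * self.dt
        derivative = np.zeros(2) if self.prev_error is None else (error - self.prev_error) / self.dt
        self.prev_error = error.copy()
        return kp * error + ki * self.integral + kd * derivative


def track_platform(steps=200, dt=0.05):
    """Toy closed loop: the UAV chases a platform that moves at constant velocity."""
    uav = np.array([0.0, 0.0])
    platform = np.array([3.0, 1.0])
    platform_vel = np.array([0.4, 0.0])

    policy, pid = GainPolicy(), PIDController(dt=dt)
    for _ in range(steps):
        error = platform - uav
        kp, ki, kd = policy.act(error)               # agent proposes PID parameters
        velocity_cmd = pid.step(error, kp, ki, kd)   # PID converts them into a command
        uav = uav + velocity_cmd * dt                # simplified UAV kinematics
        platform = platform + platform_vel * dt
    return float(np.linalg.norm(platform - uav))


if __name__ == "__main__":
    print(f"horizontal offset after 10 s: {track_platform():.3f} m")
```

In practice the velocity command would be sent to the flight controller's position or velocity interface, and descent would begin once the horizontal offset stays within a threshold; these details, like the rest of the sketch, are assumptions rather than the paper's procedure.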