[1] ZHANG Pengpeng, WEI Changyun, ZHANG Kairui, et al. Self-learning approach to control parameter adjustment for quadcopter landing on a moving platform[J]. CAAI Transactions on Intelligent Systems, 2022, 17(5): 931-940. [doi:10.11992/tis.202107040]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 17
Issue: 2022, No. 5
Pages: 931-940
Column: Academic Papers - Machine Learning
Publication date: 2022-09-05
- Title: Self-learning approach to control parameter adjustment for quadcopter landing on a moving platform
- Author(s): ZHANG Pengpeng; WEI Changyun; ZHANG Kairui; OUYANG Yongping
  College of Mechanical and Electrical Engineering, Hohai University, Changzhou 213022, China
- Keywords: autonomous landing; reinforcement learning; path planning; COACH frame; deterministic policy gradient; air-ground cooperation; UAV; optimal control
- CLC: TP273+.2
- DOI: 10.11992/tis.202107040
- Abstract:
An unmanned aerial vehicle (UAV) is a type of robot that performs well in mapping because it is not affected by the terrain. However, a UAV cannot carry out its tasks for long, owing to its small battery capacity and several other limitations. Collaboration between UAVs and unmanned ground vehicles (UGVs) is considered a crucial solution to this problem, as it can effectively reduce the time UAVs need to complete a scheduled task. When deploying a team of UAVs and UGVs, landing a UAV on a mobile platform quickly and stably is both important and challenging. To address the UAV landing problem, this study proposes a reinforcement learning PID method based on the corrective COACH framework, which provides an optimal path for the UAV to land on a mobile platform. First, the reinforcement learning agent is trained with the corrective framework in a simulated environment. Next, the trained agent is used to output control parameters in both the simulated and real environments, and these parameters are then used to compute the control variables for the UAV's position. The simulation and real UAV experiment results show that the deep reinforcement learning PID method based on the corrective COACH framework outperforms the traditional control method and can accomplish a stable landing on a mobile platform.
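The pipeline the abstract outlines (an agent supplies PID parameters, and the PID loop turns position error into a control variable) can be illustrated with a minimal single-axis sketch. This is not the authors' implementation: the names `PIDGains`, `AdaptivePID`, `placeholder_policy`, and `simulate` are hypothetical, the hand-written gain rule merely stands in for a trained agent, and the one-dimensional kinematics, speeds, and time step are assumed values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PIDGains:
    """Hypothetical container for the gains an agent would output."""
    kp: float
    ki: float
    kd: float


class AdaptivePID:
    """Single-axis PID controller whose gains may change at every step."""

    def __init__(self) -> None:
        self.integral = 0.0
        self.prev_error: Optional[float] = None

    def step(self, error: float, gains: PIDGains, dt: float) -> float:
        # Accumulate the integral term and estimate the error derivative.
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no previous sample on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return gains.kp * error + gains.ki * self.integral + gains.kd * derivative


def placeholder_policy(error: float) -> PIDGains:
    """Assumed stand-in for a trained agent: a fixed rule, not a learned policy."""
    kp = 1.0 if abs(error) > 1.0 else 0.5
    return PIDGains(kp=kp, ki=0.0, kd=0.1)


def simulate(steps: int = 200, dt: float = 0.05) -> float:
    """Chase a platform moving at constant speed; return the final position error."""
    uav_pos, platform_pos, platform_vel = 0.0, 2.0, 0.3  # assumed values
    pid = AdaptivePID()
    for _ in range(steps):
        error = platform_pos - uav_pos
        gains = placeholder_policy(error)          # agent outputs PID parameters
        velocity_cmd = pid.step(error, gains, dt)  # PID yields the control variable
        uav_pos += velocity_cmd * dt               # simplified UAV kinematics
        platform_pos += platform_vel * dt
    return platform_pos - uav_pos
```

Calling `simulate()` returns the remaining gap between the UAV and the platform. Because the placeholder rule sets the integral gain to zero, a residual tracking lag behind the moving platform is expected in this sketch; a learned policy would presumably choose gains that trade off such lag against stability.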