[1] BAI Tao, DONG Qinhao, FENG Zikun, et al. Longitudinal motion control of underwater high-speed vehicles based on reinforcement learning[J]. CAAI Transactions on Intelligent Systems, 2023, 18(5): 902-916. [doi:10.11992/tis.202203024]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
18
Issue:
2023, No. 5
Pages:
902-916
Column:
Academic Papers: Machine Learning
Publication date:
2023-09-05
- Title:
-
Longitudinal motion control of underwater high-speed vehicles based on reinforcement learning
- Author(s):
-
BAI Tao, DONG Qinhao, FENG Zikun, LI Xuehua
-
College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
- Keywords:
-
intelligent control; reinforcement learning; deep deterministic policy gradient (DDPG) algorithm; underwater high-speed vehicle; nonlinear system; longitudinal stability control; actuator saturation; diving
- Classification number:
-
TP15
- DOI:
-
10.11992/tis.202203024
- Abstract:
-
Owing to cavitation characteristics, the mathematical model of an underwater high-speed vehicle is strongly nonlinear and highly uncertain, so classical control methods such as the linear quadratic regulator (LQR) and switching control struggle to achieve effective control. This study addresses three problems: the vehicle model is difficult to decouple or linearize accurately; classical control methods cannot fully account for the complexity and variability of the underwater environment; and the controller may saturate when responding to disturbances. To this end, a reinforcement learning algorithm from intelligent control is adopted, which obtains a control policy by continuously exploring and interacting with the environment without relying on an accurate model, and it is used to design a deep deterministic policy gradient (DDPG) agent controller. The experimental results show that the designed controller ensures stable control of the longitudinal motion of the underwater high-speed vehicle. It responds to disturbances and completes the diving control task without the actuator exceeding its saturation range, and it exhibits strong robustness and better adaptability.
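As a rough illustration of the ingredients named in the abstract, the sketch below shows three generic pieces of a DDPG controller: squashing the actor output so that commands stay inside the actuator limits, the critic's bootstrapped target, and the soft target-network update. It is a minimal sketch, not the authors' implementation. The function names, the discount factor, the update rate, and the ±0.4 actuator limit in the usage lines are all hypothetical values chosen for illustration; the abstract does not give any of them.

```python
import numpy as np


def bounded_action(raw_output, low, high):
    """Map an unbounded actor output into [low, high] with tanh scaling.

    Hypothetical helper: the limits stand in for actuator saturation bounds.
    """
    raw_output = np.asarray(raw_output, dtype=float)
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    center = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return center + half_range * np.tanh(raw_output)


def td_target(reward, next_q, gamma=0.99, done=False):
    """Critic target y = r + gamma * Q'(s', mu'(s')); no bootstrap at episode end."""
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: target <- (1 - tau) * target + tau * online."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


# Usage with assumed symmetric limits of +/-0.4 (illustrative units).
limits = np.array([0.4, 0.4, 0.4])
action = bounded_action([100.0, -100.0, 0.0], low=-limits, high=limits)
```

Bounding the action inside the policy itself is one common way to keep a learned controller within actuator limits; whether the paper uses this mechanism or another (for example clipping or a saturation penalty in the reward) is not stated in the abstract.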