LI Kangbin,ZHU Qidan,MU Jinyou,et al.Automatic ship berthing path-planning method based on improved DDQN[J].CAAI Transactions on Intelligent Systems,2025,20(1):73-80.[doi:10.11992/tis.202401005]
CAAI Transactions on Intelligent Systems (《智能系统学报》) [ISSN 1673-4785/CN 23-1538/TP]
Volume: 20
Issue: 2025, No. 1
Pages: 73-80
Section: Academic Papers (Machine Learning)
Publication date: 2025-01-05
Title: Automatic ship berthing path-planning method based on improved DDQN
Authors: LI Kangbin, ZHU Qidan, MU Jinyou, JIAN Ziting
Affiliation: School of Intelligent Science and Engineering, Harbin Engineering University, Harbin 150001, China
Keywords: automatic berthing; path planning; deep reinforcement learning; double deep Q network; reward function; current velocity; state exploration; thrust; time; independent repeated experiments
CLC number: TP273
DOI: 10.11992/tis.202401005
Abstract:
During automatic berthing, a ship is affected by wind, waves, currents, and the quay-wall effect, so a precise path-planning method is needed to prevent berthing failure. For the berthing of fully actuated ships, an automatic berthing path-planning method is designed based on the double deep Q network (DDQN) algorithm. First, a three-degree-of-freedom ship model is established; the reward function is then improved by treating distance, heading, thrust, time, and collision as rewards or penalties. DDQN is introduced to learn the action-reward model, and the learned policy is used to steer the ship. By pursuing higher reward values, the ship can find the optimal berthing path on its own. Experimental results show that the ship completes berthing under different current velocities while reducing both time and thrust, and that at the same current velocity the DDQN algorithm reduces berthing thrust by 241.940, 234.614, and 80.202 N compared with the Q-learning, SARSA (state action reward state action), and deep Q network (DQN) algorithms, respectively, with a berthing time of only 252.485 s.
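The page reproduces only the abstract, not the paper's equations. For reference, the three-degree-of-freedom ship model the abstract mentions is conventionally written in the standard surge-sway-yaw maneuvering form below; this is a hedged sketch of the usual formulation, not the paper's exact matrices or coefficients:

% Standard 3-DOF (surge-sway-yaw) maneuvering model, kinematics + kinetics
\eta = [x,\ y,\ \psi]^{\mathsf{T}}, \qquad \nu = [u,\ v,\ r]^{\mathsf{T}}
\dot{\eta} = R(\psi)\,\nu
M\dot{\nu} + C(\nu)\,\nu + D\,\nu = \tau + \tau_{\mathrm{env}}

Here \eta is the earth-fixed position and heading, \nu the body-fixed surge, sway, and yaw velocities, R(\psi) the yaw rotation matrix, M, C(\nu), and D the inertia, Coriolis, and damping matrices, \tau the control forces and moment delivered by the thrusters, and \tau_{\mathrm{env}} the disturbance from wind, waves, and current.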
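To make the abstract's method concrete, the sketch below pairs a composite berthing reward, with distance, heading, thrust, time, and collision terms as the abstract lists, with the double-DQN target update that distinguishes DDQN from plain DQN. It is a minimal illustration: the network size, state encoding, and all reward weights are assumptions, not values from the paper.

import torch
import torch.nn as nn

class QNet(nn.Module):
    # Small MLP mapping a ship state (e.g., position, heading, velocities)
    # to Q-values over a discrete set of thruster commands.
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return self.net(s)

def berthing_reward(dist, heading_err, thrust, dt, collided,
                    w=(1.0, 0.5, 0.01, 0.1, 100.0)):
    # Composite reward in the spirit of the abstract: distance to the berth,
    # heading error, thrust, and elapsed time are penalized; a collision
    # draws a large one-off penalty. The weights w are hypothetical.
    w_d, w_h, w_u, w_t, w_c = w
    r = -(w_d * dist + w_h * abs(heading_err) + w_u * thrust + w_t * dt)
    return r - w_c if collided else r

def ddqn_target(r, s_next, done, online, target, gamma=0.99):
    # Double-DQN target: the online net *selects* the greedy next action,
    # the frozen target net *evaluates* it. Plain DQN uses the target net
    # for both steps, which tends to overestimate Q-values.
    with torch.no_grad():
        a_star = online(s_next).argmax(dim=1, keepdim=True)
        q_next = target(s_next).gather(1, a_star).squeeze(1)
        return r + gamma * (1.0 - done) * q_next

The design point of the double update is the decoupling of action selection from action evaluation, which curbs the overestimation bias of DQN; the abstract's reported reductions in thrust and berthing time relative to Q-learning, SARSA, and DQN are consistent with this motivation.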
Memo
Received: 2024-01-03.
Funding: National Natural Science Foundation of China (52171299).
About the authors: LI Kangbin, master's student. His main research interests are automatic ship berthing and path planning. E-mail: 422698152@qq.com. ZHU Qidan, professor and doctoral supervisor, executive director of the Heilongjiang Provincial Automation Society. His main research interests are intelligent robot technology and applications, intelligent control system design, and image processing and pattern recognition. He has led 30 research projects, including National Natural Science Foundation of China projects, high-tech ship special projects of the Ministry of Industry and Information Technology, and international cooperation projects of the Ministry of Science and Technology. His research achievements have won one second prize of the National Science and Technology Progress Award, three first prizes of the National Defense Science and Technology Progress Award, one first prize of the Military Science and Technology Progress Award, and three second prizes of the Heilongjiang Science and Technology Progress Award. He holds 20 granted invention patents and 5 software copyrights, and has published 4 monographs and 200 academic papers. E-mail: zhuqidan@hrbeu.edu.cn. MU Jinyou, doctoral candidate. His main research interests are autonomous ship berthing and intelligent ships. E-mail: mujinyou96@163.com.
Corresponding author: ZHU Qidan. E-mail: zhuqidan@hrbeu.edu.cn.
Last update: 2025-01-05