[1] BAI Tao, DONG Qinhao, FENG Zikun, et al. Longitudinal motion control of underwater high-speed vehicles based on reinforcement learning[J]. CAAI Transactions on Intelligent Systems, 2023, 18(5): 902-916. [doi:10.11992/tis.202203024]

Longitudinal motion control of underwater high-speed vehicles based on reinforcement learning


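Memo

Received: 2022-03-24.
Funding: Natural Science Foundation of Heilongjiang Province (LH2021E043).
About the authors: BAI Tao, associate professor, whose main research interests are the navigation and motion control of underwater high-speed vehicles; has led projects funded by the National Natural Science Foundation of China (Young Scientists program) and the Natural Science Foundation of Heilongjiang Province, published more than 10 academic papers, and authored 1 monograph. DONG Qinhao, master's student, whose main research interest is the motion control of underwater high-speed vehicles. FENG Zikun, master's student, whose main research interest is the motion control of underwater high-speed vehicles.
Corresponding author: BAI Tao. E-mail: baitao1@hrbeu.edu.cn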
