[1]严家政,专祥涛.基于强化学习的参数自整定及优化算法[J].智能系统学报,2022,17(2):341-347.[doi:10.11992/tis.202012038]
 YAN Jiazheng,ZHUAN Xiangtao.Parameter self-tuning and optimization algorithm based on reinforcement learning[J].CAAI Transactions on Intelligent Systems,2022,17(2):341-347.[doi:10.11992/tis.202012038]
点击复制

基于强化学习的参数自整定及优化算法

参考文献/References:
[1] 赵新华, 王璞, 陈晓红. 投球机器人模糊PID控制[J]. 智能系统学报, 2015, 10(3): 399–406
ZHAO Xinhua, WANG Pu, CHEN Xiaohong. Fuzzy PID control of pitching robots[J]. CAAI transactions on intelligent systems, 2015, 10(3): 399–406
[2] YANG Bo, YU Tao, SHU Hongchun, et al. Perturbation observer based fractional-order PID control of photovoltaics inverters for solar energy harvesting via Yin-Yang-Pair optimization[J]. Energy conversion and management, 2018, 171: 170–187.
[3] JAISWAL S, CHILUKA S K, SEEPANA M M, et al. Design of fractional order PID controller using genetic algorithm optimization technique for nonlinear system[J]. Chemical product and process modeling, 2020, 15(2): 20190072.
[4] 陈增强, 黄朝阳, 孙明玮, 等. 基于大变异遗传算法进行参数优化整定的负荷频率自抗扰控制[J]. 智能系统学报, 2020, 15(1): 41–49
CHEN Zengqiang, HUANG Zhaoyang, SUN Mingwei, et al. Active disturbance rejection control of load frequency based on big probability variation’s genetic algorithm for parameter optimization[J]. CAAI transactions on intelligent systems, 2020, 15(1): 41–49
[5] WEI Wei, CHEN Nan, ZHANG Zhiyuan, et al. U-model-based active disturbance rejection control for the dissolved oxygen in a wastewater treatment process[J]. Mathematical problems in engineering, 2020: 3507910.
[6] 胡越, 罗东阳, 花奎, 等. 关于深度学习的综述与讨论[J]. 智能系统学报, 2019, 14(1): 1–19
HU Yue, LUO Dongyang, HUA Kui, et al. Review and discussion on deep learning[J]. CAAI transactions on intelligent systems, 2019, 14(1): 1–19
[7] SILVER D, HUANG A, MADDISON C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529(7587): 484–489.
[8] 李超, 张智, 夏桂华, 等. 基于强化学习的学习变阻抗控制[J]. 哈尔滨工程大学学报, 2019, 40(2): 304–311
LI Chao, ZHANG Zhi, XIA Guihua, et al. Learning variable impedance control based on reinforcement learning[J]. Journal of Harbin Engineering University, 2019, 40(2): 304–311
[9] 王念滨, 何鸣, 王红滨, 等. 适用于水下目标识别的快速降维卷积模型[J]. 哈尔滨工程大学学报, 2019, 40(7): 1327–1333
WANG Nianbin, HE Ming, WANG Hongbin, et al. Fast dimensional-reduction convolution model for underwater target recognition[J]. Journal of Harbin Engineering University, 2019, 40(7): 1327–1333
[10] 黄立威, 江碧涛, 吕守业, 等. 基于深度学习的推荐系统研究综述[J]. 计算机学报, 2018, 41(7): 1619–1647
HUANG Liwei, JIANG Bitao, LYU Shouye, et al. A review of recommendation systems based on deep learning[J]. Chinese journal of computers, 2018, 41(7): 1619–1647
[11] GHEISARNEJAD M, KHOOBAN M H. An intelligent non-integer PID controller-based deep reinforcement learning: implementation and experimental results[J]. IEEE transactions on industrial electronics, 2021, 68(4): 3609–3618.
[12] BUSONIU L, DE BRUIN T, TOLI? D, et al. Reinforcement learning for control: performance, stability, and deep approximators[J]. Annual reviews in control, 2018, 46: 8–28.
[13] 袁兆麟, 何润姿, 姚超, 等. 基于强化学习的浓密机底流浓度在线控制算法[J]. 自动化学报, 2021, 47(7): 1558–1571
YUAN Zhaolin, HE Runzi, YAO Chao, et al. Online reinforcement learning control algorithm for concentration of thickener underflow[J]. Acta automatica sinica, 2021, 47(7): 1558–1571
[14] NIAN R, LIU J, HUANG B. A review on reinforcement learning: introduction and applications in industrial process control[J]. Computers and chemical engineering, 2020: 106886.
[15] PANG B, JIANG Z P, MAREELS I. Reinforcement learning for adaptive optimal control of continuous-time linear periodic systems[J]. Automatica, 2020, 118: 109035.
[16] 殷昌盛, 杨若鹏, 朱巍, 等. 多智能体分层强化学习综述[J]. 智能系统学报, 2020, 15(4): 646–655
YIN Changsheng, YANG Ruopeng, ZHU Wei, et al. A survey on multi-agent hierarchical reinforcement learning[J]. CAAI transactions on intelligent systems, 2020, 15(4): 646–655
[17] 高瑞娟, 吴梅. 基于改进强化学习的PID参数整定原理及应用[J]. 现代电子技术, 2014, 37(4): 1–4
GAO Ruijuan, WU Mei. Principle and application of PID parameter tuning based on improved reinforcement learning[J]. Modern electronics technique, 2014, 37(4): 1–4
[18] ALDEMIR A, HAPO?LU H. Comparison of PID tuning methods for wireless temperature control[J]. Journal of polytechnic, 2016, 19(1): 9–19.
[19] 蔡聪仁, 向凤红. 基于遗传算法优化PID的板球系统位置控制[J]. 电子测量技术, 2019, 42(23): 97–101
CAI Congren, XIANG Fenghong. Position control of cricket system based on genetic algorithm optimized PID[J]. Electronic measurement technology, 2019, 42(23): 97–101
[20] 么洪飞, 王宏健, 王莹, 等. 基于遗传算法DDBN参数学习的UUV威胁评估[J]. 哈尔滨工程大学学报, 2018, 39(12): 1972–1978
YAO Hongfei, WANG Hongjian, WANG Ying, et al. UUV threat assessment based on genetic algorithm DDBN parameter learning[J]. Journal of Harbin Engineering University, 2018, 39(12): 1972–1978
[21] 胡勤丰, 陈威振, 邱攀峰, 等. 适用于连续加减速的永磁同步电机模糊增益自调整PI控制研究[J]. 中国电机工程学报, 2017, 37(3): 907–914
HU Qinfeng, CHEN Weizhen, QIU Panfeng, et al. Research on fuzzy self-tuning gain PI control for accelerating and decelerating based on permanent magnet synchronous motor[J]. Proceedings of the CSEE, 2017, 37(3): 907–914
[22] 叶政. PID控制器参数整定方法研究及其应用[D]. 北京: 北京邮电大学, 2016.
YE Zheng. Research on PID controller parameter tuning method and its application [D]. Beijing: Beijing University of Posts and Telecommunications, 2016.
[23] 刘志林, 李国胜, 张军. 有横摇约束的欠驱动船舶航迹跟踪预测控制[J]. 哈尔滨工程大学学报, 2019, 40(2): 312–317
LIU Zhilin, LI Guosheng, ZHANG Jun. Predictive control of underactuated ship track tracking with roll constraint[J]. Journal of Harbin Engineering University, 2019, 40(2): 312–317
[24] 朱芮, 吴迪, 陈继峰, 等. 电机系统模型预测控制研究综述[J]. 电机与控制应用, 2019, 46(8): 1–10,30
ZHU Rui, WU Di, CHEN Jifeng, et al. A review of model predictive control for motor systems[J]. Electric machines and control application, 2019, 46(8): 1–10,30
[25] PU Z, WANG Y, CHANG N, et al. A deep reinforcement learning framework for optimizing fuel economy of hybrid electric vehicles[C]//2018 23rd Asia and South Pacific Design Automation Conference. Jeju Island, Korea, 2018.
[26] 张法帅, 李宝安, 阮子涛. 基于深度强化学习的无人艇航行控制[J]. 计测技术, 2018, 38(A01): 5
ZHANG Fashuai, LI Baoan, RUAN Zitao. Navigation control of unmanned vehicle based on deep reinforcement learning[J]. Metrology and measurement technology, 2018, 38(A01): 5
[27] 唐振韬, 邵坤, 赵冬斌, 等. 深度强化学习进展: 从AlphaGo到AlphaGo Zero[J]. 控制理论与应用, 2017, 34(12): 18
TANG Zhentao, SHAO Kun, ZHAO Dongbin, et al. Progress in deep reinforcement learning: from AlphaGo to AlphaGo Zero[J]. Control theory and applications, 2017, 34(12): 18
相似文献/References:
[1]连传强,徐昕,吴军,等.面向资源分配问题的Q-CF多智能体强化学习[J].智能系统学报,2011,6(2):95.
 LIAN Chuanqiang,XU Xin,WU Jun,et al.Q-CF multiAgent reinforcement learningfor resource allocation problems[J].CAAI Transactions on Intelligent Systems,2011,6():95.
[2]梁爽,曹其新,王雯珊,等.基于强化学习的多定位组件自动选择方法[J].智能系统学报,2016,11(2):149.[doi:10.11992/tis.201510031]
 LIANG Shuang,CAO Qixin,WANG Wenshan,et al.An automatic switching method for multiple location components based on reinforcement learning[J].CAAI Transactions on Intelligent Systems,2016,11():149.[doi:10.11992/tis.201510031]
[3]张文旭,马磊,王晓东.基于事件驱动的多智能体强化学习研究[J].智能系统学报,2017,12(1):82.[doi:10.11992/tis.201604008]
 ZHANG Wenxu,MA Lei,WANG Xiaodong.Reinforcement learning for event-triggered multi-agent systems[J].CAAI Transactions on Intelligent Systems,2017,12():82.[doi:10.11992/tis.201604008]
[4]周文吉,俞扬.分层强化学习综述[J].智能系统学报,2017,12(5):590.[doi:10.11992/tis.201706031]
 ZHOU Wenji,YU Yang.Summarize of hierarchical reinforcement learning[J].CAAI Transactions on Intelligent Systems,2017,12():590.[doi:10.11992/tis.201706031]
[5]张文旭,马磊,贺荟霖,等.强化学习的地-空异构多智能体协作覆盖研究[J].智能系统学报,2018,13(2):202.[doi:10.11992/tis.201609017]
 ZHANG Wenxu,MA Lei,HE Huilin,et al.Air-ground heterogeneous coordination for multi-agent coverage based on reinforced learning[J].CAAI Transactions on Intelligent Systems,2018,13():202.[doi:10.11992/tis.201609017]
[6]徐鹏,谢广明,文家燕,等.事件驱动的强化学习多智能体编队控制[J].智能系统学报,2019,14(1):93.[doi:10.11992/tis.201807010]
 XU Peng,XIE Guangming,WEN Jiayan,et al.Event-triggered reinforcement learning formation control for multi-agent[J].CAAI Transactions on Intelligent Systems,2019,14():93.[doi:10.11992/tis.201807010]
[7]郭宪,方勇纯.仿生机器人运动步态控制:强化学习方法综述[J].智能系统学报,2020,15(1):152.[doi:10.11992/tis.201907052]
 GUO Xian,FANG Yongchun.Locomotion gait control for bionic robots: a review of reinforcement learning methods[J].CAAI Transactions on Intelligent Systems,2020,15():152.[doi:10.11992/tis.201907052]
[8]申翔翔,侯新文,尹传环.深度强化学习中状态注意力机制的研究[J].智能系统学报,2020,15(2):317.[doi:10.11992/tis.201809033]
 SHEN Xiangxiang,HOU Xinwen,YIN Chuanhuan.State attention in deep reinforcement learning[J].CAAI Transactions on Intelligent Systems,2020,15():317.[doi:10.11992/tis.201809033]
[9]殷昌盛,杨若鹏,朱巍,等.多智能体分层强化学习综述[J].智能系统学报,2020,15(4):646.[doi:10.11992/tis.201909027]
 YIN Changsheng,YANG Ruopeng,ZHU Wei,et al.A survey on multi-agent hierarchical reinforcement learning[J].CAAI Transactions on Intelligent Systems,2020,15():646.[doi:10.11992/tis.201909027]
[10]莫宏伟,田朋.基于注意力融合的图像描述生成方法[J].智能系统学报,2020,15(4):740.[doi:10.11992/tis.201910039]
 MO Hongwei,TIAN Peng.An image caption generation method based on attention fusion[J].CAAI Transactions on Intelligent Systems,2020,15():740.[doi:10.11992/tis.201910039]

备注/Memo

收稿日期:2020-12-23。
基金项目:深圳市知识创新计划项目(JCYJ20170818144449801)
作者简介:严家政,硕士研究生,主要研究方向为深度强化学习、最优控制;专祥涛,教授,博士生导师,IEEE会员,湖北省自动化学会常务理事,主要研究方向为载体运动过程建模与控制、新能源系统规划与运行、资源优化分配、智能控制与数据分析。发表学术论文30余篇
通讯作者:专祥涛.E-mail:xtzhuan@whu.edu.cn

更新日期/Last Update: 1900-01-01
Copyright @ 《 智能系统学报》 编辑部
地址:(150001)黑龙江省哈尔滨市南岗区南通大街145-1号楼 电话:0451- 82534001、82518134