[1]徐鹏,谢广明,文家燕,等.事件驱动的强化学习多智能体编队控制[J].智能系统学报,2019,14(1):93-98.[doi:10.11992/tis.201807010]
XU Peng,XIE Guangming,WEN Jiayan,et al.Event-triggered reinforcement learning formation control for multi-agent[J].CAAI Transactions on Intelligent Systems,2019,14(1):93-98.[doi:10.11992/tis.201807010]
Journal: CAAI Transactions on Intelligent Systems [ISSN 1673-4785 / CN 23-1538/TP]
Volume: 14
Issue: 2019, No. 1
Pages: 93-98
Column: Academic Papers - Machine Learning
Publication date: 2019-01-05
- Title: Event-triggered reinforcement learning formation control for multi-agent
- Author(s):
XU Peng1, XIE Guangming1,2,3, WEN Jiayan1,2, GAO Yuan1
1. School of Electric and Information Engineering, Guangxi University of Science and Technology, Liuzhou 545006, China;
2. College of Engineering, Peking University, Beijing 100871, China;
3. Institute of Ocean Research, Peking University, Beijing 100871, China
- Keywords: reinforcement learning; multi-agent; event-triggered; formation control; Markov decision processes; swarm intelligence; action decisions; particle swarm optimization
- CLC number: TP391.8
- DOI: 10.11992/tis.201807010
- Abstract:
Classical reinforcement learning for multi-agent formation consumes substantial communication and computation resources. This paper introduces an event-triggered control mechanism so that agents need not make action decisions at a fixed period; instead, each agent updates its action only when an event-triggered condition is satisfied. The condition takes into account not only an agent's cumulative reward but also the deviation between its reward and those of its neighbors, and the agents interact to seek the optimal joint policy that achieves the formation. Numerical simulation results demonstrate that the event-triggered reinforcement learning formation control algorithm effectively reduces the frequency of the agents' action decisions and the consumption of resources while preserving system performance.
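The triggering idea described in the abstract (fire an event when an agent's cumulative reward has drifted, or when it deviates too far from its neighbors' rewards) can be sketched as follows. This is a minimal illustration, not the paper's actual formulation: the function name, the two thresholds, and the reward bookkeeping are all assumptions for exposition.

```python
def should_trigger(cum_reward, last_trigger_reward, neighbor_rewards,
                   drift_threshold=0.5, deviation_threshold=1.0):
    """Hypothetical event-triggered condition (illustrative, not the paper's).

    An event fires, and the agent recomputes its action, when either:
    - the cumulative reward has drifted far from its value at the last
      action update, or
    - the reward deviates too much from the neighbors' average reward.
    Otherwise the agent keeps its previous action, saving a decision step.
    """
    drift = abs(cum_reward - last_trigger_reward)
    avg_neighbor = sum(neighbor_rewards) / len(neighbor_rewards)
    deviation = abs(cum_reward - avg_neighbor)
    return drift > drift_threshold or deviation > deviation_threshold

# Agent far from neighbors' average reward -> event fires, action is updated.
print(should_trigger(5.0, 5.0, [1.0, 1.5]))   # True (deviation 3.75 > 1.0)
# Reward stable and in agreement with neighbors -> no event, keep old action.
print(should_trigger(1.2, 1.0, [1.0, 1.5]))   # False
```

Between events the agent simply replays its last action, which is what reduces the decision frequency relative to a fixed-period (time-triggered) policy update.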
Memo
Received: 2018-07-11.
Foundation items: National Key R&D Program of China (2017YFB1400800); National Natural Science Foundation of China (91648120, 61633002, 51575005, 61563006, 61563005); Key Laboratory of Industrial Process Intelligent Control Technology of Guangxi Universities (IPICT-2016-04).
About the authors: XU Peng, male, born in 1991, master's student; his research interests include multi-agent systems, reinforcement learning, and deep learning. XIE Guangming, male, born in 1972, professor and doctoral supervisor; his research interests include dynamics and control of complex systems, intelligent biomimetic robots, and multi-robot systems and control. He currently leads three key projects of the National Natural Science Foundation of China and holds more than 10 granted invention patents. He has received the First Prize of the Natural Science Award of the Ministry of Education and the Second Prize of the State Natural Science Award, and has published more than 300 academic papers, over 120 of which are indexed by SCI and over 120 by EI. WEN Jiayan, male, born in 1981, associate professor, Ph.D.; his research interests include event-triggered control and multi-agent formation control. He has published more than 10 academic papers.
Corresponding author: WEN Jiayan. E-mail: wenjiayan2012@126.com