[1] YIN Changsheng, YANG Ruopeng, ZHU Wei, et al. A survey on multi-agent hierarchical reinforcement learning[J]. CAAI Transactions on Intelligent Systems, 2020, 15(4): 646-655. [doi:10.11992/tis.201909027]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 15
Issue: No. 4, 2020
Pages: 646-655
Column: Review
Publication date: 2020-07-05
- Title:
-
A survey on multi-agent hierarchical reinforcement learning
- Author(s):
-
YIN Changsheng (殷昌盛), YANG Ruopeng (杨若鹏), ZHU Wei (朱巍), ZOU Xiaofei (邹小飞), LI Feng (李峰)
-
School of Information and Communication, National University of Defense Technology, Wuhan 430010, China
- Keywords:
-
artificial intelligence; machine learning; reinforcement learning; multi-agent; survey; deep learning; hierarchical reinforcement learning; application status
- Classification number:
-
TP18
- DOI:
-
10.11992/tis.201909027
- Abstract:
-
As an important branch of machine learning and artificial intelligence, multi-agent hierarchical reinforcement learning (MAHRL) combines, in a general-purpose form, the collaborative capability of multi-agent systems (MAS) with the decision-making capability of reinforcement learning (RL). By decomposing a complex RL problem into several sub-problems and solving each one separately, it can effectively overcome the curse of dimensionality. This makes MAHRL a potential approach to intelligent decision making in large-scale, complex settings. This paper first describes the main techniques underlying MAHRL: reinforcement learning (RL), the semi-Markov decision process (SMDP), and multi-agent reinforcement learning (MARL). It then reviews, from the perspective of hierarchy, the algorithmic principles and research status of four categories of MAHRL methods: option-based, hierarchical abstract machine (HAM)-based, value function decomposition-based (MAXQ), and end-to-end. Finally, it summarizes the application status of MAHRL in robot control, game decision making, and mission planning.
Last Update: 2020-07-25