[1] ZHOU Wenji, YU Yang. Summarize of hierarchical reinforcement learning[J]. CAAI Transactions on Intelligent Systems, 2017, 12(5): 590-594. [doi:10.11992/tis.201706031]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
12
Issue:
2017, No. 5
Pages:
590-594
Column:
Review
Publication date:
2017-10-25
- Title:
-
Summarize of hierarchical reinforcement learning
- Author(s):
-
ZHOU Wenji (周文吉), YU Yang (俞扬)
-
National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, Jiangsu, China
- Keywords:
-
artificial intelligence; machine learning; reinforcement learning; hierarchical reinforcement learning; deep reinforcement learning; Markov decision process; semi-Markov decision process; curse of dimensionality
- Classification number:
-
TP391
- DOI:
-
10.11992/tis.201706031
- Abstract:
-
Reinforcement learning (RL) is an important branch of machine learning and artificial intelligence that has received increasing attention from industry and the wider community in recent years. The central problem in RL is how an agent can learn a policy by interacting directly with the environment so as to maximize its long-term total reward. Traditional RL algorithms suffer from the so-called curse of dimensionality: their learning performance degrades drastically as the dimensionality of the state space increases. Hierarchical reinforcement learning (HRL) decomposes a complex RL problem into several sub-problems and solves each of them separately, which can yield better results than solving the whole problem directly. HRL offers a potential way to solve large-scale RL problems, yet it has received insufficient attention to date. In this paper, we introduce and review several main categories of HRL methods.
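Memo
Received: 2017-06-09.
Funding: National Natural Science Foundation of China (61375061); Natural Science Foundation of Jiangsu Province (BK20160066).
About the authors: ZHOU Wenji, male, born in 1995, master's student; his main research interests are reinforcement learning and data mining. YU Yang, male, born in 1982, associate professor and doctoral supervisor; his main research interests are artificial intelligence, machine learning, evolutionary computation, and data mining. He received the National Excellent Doctoral Dissertation Award in 2013 and the China Computer Federation Excellent Doctoral Dissertation Award in 2011, and has published more than 40 papers.
Corresponding author: YU Yang. E-mail: yuy@nju.edu.cn
Last Update: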
2017-10-25