[1]ZHOU Wenji,YU Yang.Summarize of hierarchical reinforcement learning[J].CAAI Transactions on Intelligent Systems,2017,12(5):590-594.[doi:10.11992/tis.201706031]

Summarize of hierarchical reinforcement learning

[1] BARTO A G, MAHADEVAN S. Recent advances in hierarchical reinforcement learning[J]. Discrete event dynamic systems, 2013,13(4):341-379.
[2] YAN Q, LIU Q, HU D. A hierarchical reinforcement learning algorithm based on heuristic reward function[C]//In Proceedings of 2nd International Conference on Advanced Computer Control. Shenyang, China, 2010, 3:371-376.
[3] DETHLEFS N, CUAYáHUITL H. Combining hierarchical reinforcement learning and Bayesian networks for natural language generation in situated dialogue[C]//European Workshop on Natural Language Generation. Nancy, France,2011:110-120.
[4] AL-EMRAN M. Hierarchical reinforcement learning:a survey[J]. International journal of computing and digital systems, 2015, 4(2):137-143.
[5] MAHADEVAN S, MARCHALLECK N. Self-improving factory simulation using continuous-time average-reward reinforcement learning[C]. In Proceedings of the Machine Learning International Workshop. Nashville, USA, 1997:202-210.
[6] HOWARD R A. Semi-Markov and decision processes[M]. New York:DOVER Publications, 2007.
[7] GIL P, NUNES L. Hierarchical reinforcement learning using path clustering[C]//In Proceedings of 8th Iberian Conference on Information Systems and Technologies. Lisboa, Portugal, 2013:1-6.
[8] STULP F, SCHAAL S. Hierarchical reinforcement learning with movement primitives[C]//In Proceedings of 11th IEEE-RAS International Conference on Humanoid Robots. Bled, Slovenia, 2011:231-238.
[9] DU X, LI Q, HAN J. Applying hierarchical reinforcement learning to computer games[C]//In Proceedings of IEEE International Conference on Automation and Logistics. Xi’an, China, 2009:929-932.
[10] SUTTON R S, PRECUP D, SINGH S. Between MDPs and semi-MDPs:a framework for temporal abstraction in reinforcement learning[J]. Artificial intelligence, 1999, 112(1/2):181-211.
[11] PARR R, RUSSELL S. Reinforcement learning with hierarchies of machines[C]//Advances in Neural Information Processing Systems. Colorado, USA, 1998:1043-1049.
[12] DIETTERICH T G. Hierarchical reinforcement learning with the MAXQ value function decomposition[J]. Journal of artificial intelligence research, 2000, 13:227-303.
[13] MOHSEN G, TAGHIZADEH N, et al. Automatic abstraction in reinforcement learning using ant system algorithm[C]//In Proceedings of AAAI Spring Symposium:Lifelong Machine Learning. Stanford, USA, 2013:114-122.
[14] PIERRE-LUC BACON, JEAN HARB. The option-critic architecture[C]//In Proceeding of 31th AAAI Conference on Artificial Intelligence. San Francisco, USA, 2017:1726-1734.
[15] VEZHNEVETS A S, OSINDERO S, SCHAUL T, et al. FeUdal networks for hierarchical reinforcement learning[C]//In Proceedings of the 34th International Conference on Machine Learning. Sydney, Australia, 2017:3540-3549.
[16] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(2):529-533.
[17] TEJAS D. K, KARTHNIK N, ARDAVAN S, et al. Hierarchical deep reinforcement learning:integrating temporal abstraction and intrinsic motivation[C]//Annual Conference on Neural Information Processing Systems. Barcelona, Spain, 2016:3675-3683.
[18] CARLOS FLORENSA, YAN D, PIETER A. Stochastic neural networks for hierarchical reinforcement learning[EB/OL]. Berkeley, USA, arXiv. 2017, https://arxiv.org/pdf/1704.03012.pdf.
[19] LAKSHMINARAYANAN A S, KRISHNAMURTHY R, KUMAR P, et al. Option discovery in hierarchical reinforcement learning using spatio-temporal clustering[EB/OL]. Madras, India, arXiv, 2016, https://arxiv.org/pdf/1605.05359.pdf.
[20] XUE B, GLEN B. DeepLoco:dynamic locomotion skills using hierarchical deep reinforcement learning[J]. ACM transactions on graphics,2017, 36(4):1-13.
Similar References:



Last Update: 2017-10-25

Copyright © CAAI Transactions on Intelligent Systems