[1] WANG Xue-ning, CHEN Wei, ZHANG Men, et al. A survey of direct policy search methods in reinforcement learning[J]. CAAI Transactions on Intelligent Systems, 2007, 2(1): 16-24.
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume:
2
Issue:
2007, No. 1
Page number:
16-24
Column:
Review
Publication date:
2007-02-25
- Title:
-
A survey of direct policy search methods in reinforcement learning
- Author(s):
-
WANG Xue-ning1; CHEN Wei1; ZHANG Men2; XU Xin1; HE Han-gen1
-
1. School of Electromechanical Engineering and Automation, National University of Defense Technology, Changsha 410073, China;
2. Qinghe Building Zi 9, Beijing 100085, China
-
- Keywords:
-
reinforcement learning; policy search; policy gradient
- CLC:
-
TP242
- DOI:
-
-
- Abstract:
-
This paper describes direct policy search methods in reinforcement learning and presents the theoretical framework of policy gradient methods. Within this framework, several current policy gradient algorithms are generalized. New methods for speeding up policy gradient algorithms are discussed, and new search methods that do not rely on the policy gradient are also described. Finally, some directions for future research are given.