[1]ZHANG Xiaochuan, TANG Yan, LIANG Ningning. A 9×9 Go computer game system using temporal difference[J]. CAAI Transactions on Intelligent Systems, 2012, 7(3): 278-282.

A 9×9 Go computer game system using temporal difference

References:
[1]ZHANG Congpin, LIU Chunhong, XU Jiucheng. Research on alpha-beta pruning of heuristic search in game-playing tree[J]. Computer Engineering and Applications, 2008, 44(16): 54-55, 97.
[2]LIU Zhiqing, LI Wenfeng. Fundamentals of modern computer Go[M]. Beijing: Beijing University of Posts and Telecommunications Press, 2011: 63-80.
[3]GELLY S, WANG Yizao, MUNOS R, et al. Modification of UCT with patterns in Monte-Carlo Go[R/OL]. [2011-10-15]. http://219.142.86.87/paper/RR6062.pdf.
[4]GELLY S, WANG Yizao. Exploration exploitation in Go: UCT for Monte-Carlo Go[C/OL]. [2011-10-15]. http://wenku.baidu.com/view/66c2edd6b9f3f90f76c61bc0.html.
[5]ZHANG Rubo, ZHOU Ning, GU Guochang, et al. Reinforcement learning based obstacle avoidance learning for intelligent robot[J]. Robot, 1995, 21(3): 204-209.
[6]SHEN Jing, GU Guochang, LIU Haibo. Hierarchical reinforcement learning with an automatically generated hierarchy based on immune clustering[J]. Journal of Harbin Engineering University, 2007, 28(4): 423-428.
[7]BAE J, CHHATBAR P, FRANCIS J T, et al. Reinforcement learning via kernel temporal difference[C]//Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society. Boston, USA, 2011: 5662-5665.
[8]SUTTON R S. Learning to predict by the methods of temporal differences[J]. Machine Learning, 1988, 3(1): 9-44.
[9]KAELBLING L P, LITTMAN M L, MOORE A W. Reinforcement learning: a survey[J]. Journal of Artificial Intelligence Research, 1996, 4: 237-285.
[10]ALPAYDIN E. Introduction to machine learning[M]. FAN Ming, ZAN Hongying, NIU Changyong, trans. Beijing: China Machine Press, 2009: 372-390.
[11]SUTTON R S, BARTO A G. Reinforcement learning: an introduction[M]. Cambridge, USA: The MIT Press, 1997.
[12]NIE Weiping, FENG Dashu. Nie Weiping Go dojo[M]. Beijing: Beijing Sport University Press, 2004.
[13]XU Changming, MA Zongmin, XU Xinhe, et al. Study of temporal difference learning in computer games[J]. Computer Science, 2010, 37(8): 219-224.

Last Update: 2012-09-05

Copyright © CAAI Transactions on Intelligent Systems