[1] ZHANG Xiaochuan, TANG Yan, LIANG Ningning. A 9×9 Go computer game system using temporal difference[J]. CAAI Transactions on Intelligent Systems, 2012, 7(3): 278-282.

A 9×9 Go computer game system using temporal difference


Memo

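Received: 2011-10-17. Published online: 2012-05-18.
Funding: Research Project of the Chongqing Municipal Education Commission (KJ120824); Natural Science Foundation of Chongqing (2007BB2415).
Corresponding author: ZHANG Xiaochuan. E-mail: cqpczxc@163.com.
About the authors:
ZHANG Xiaochuan, male, born in 1965, professor, deputy director of the Computer Games Technical Committee of the Chinese Association for Artificial Intelligence. His main research interests include artificial intelligence, artificial life, and computer software. He has led 6 national and provincial or ministerial-level projects and more than 30 industry-sponsored projects, and has received one Chongqing Natural Science Award, one Science and Technology Progress Award, and one Chongqing Teaching Achievement Award. He has been chief editor of 3 textbooks and has published more than 50 academic papers.
TANG Yan, female, born in 1987, master's student. Her main research interests are computational intelligence and intelligent software.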

Last Update: 2012-09-05