[1] ZHANG Xiaochuan, TANG Yan, LIANG Ningning. A 9×9 Go computer game system using temporal difference[J]. CAAI Transactions on Intelligent Systems, 2012, 7(3): 278-282.
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
7
Issue:
No. 3, 2012
Pages:
278-282
Column:
Academic Papers: Intelligent Systems
Publication date:
2012-06-25
- Title:
-
A 9×9 Go computer game system using temporal difference
- Article ID:
-
1673-4785(2012)03-0278-05
- Author(s):
-
ZHANG Xiaochuan (张小川), TANG Yan (唐艳), LIANG Ningning (梁宁宁)
-
College of Computer Science and Engineering, Chongqing University of Technology (重庆理工大学 计算机科学与工程学院), Chongqing 400054, China
-
- Keywords:
-
computer game; 9×9 Go; Go computer game; temporal difference algorithm
- Classification number:
-
TP31
- Document code:
-
A
- Abstract:
-
Computer Go is one of the important branches of computer game playing, and its enormous game space poses a great challenge to researchers in the field. At present, Go programs mostly rely on static evaluation search and Monte Carlo tree search. In this paper, a temporal difference algorithm is introduced into a 9×9 Go computer game system, and a model of a Go game system based on temporal difference learning is proposed. The resulting system has a degree of self-learning capability and gradually improves its playing strength through continued play. Actual games against a system that uses the α-β search algorithm demonstrated that the method is feasible.
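The abstract does not give the details of the learning rule, so the following is only a minimal sketch of a generic TD(0) weight update for a linear position evaluator, not the authors' implementation. The function names, the feature vectors, and the parameters `alpha` (learning rate) and `gamma` (discount factor) are all assumptions introduced for illustration.

```python
from typing import List, Sequence


def evaluate(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear evaluation: weighted sum of (hypothetical) board features."""
    return sum(w * f for w, f in zip(weights, features))


def td0_update(
    weights: Sequence[float],
    features_t: Sequence[float],
    features_next: Sequence[float],
    reward: float,
    alpha: float = 0.1,
    gamma: float = 1.0,
    terminal: bool = False,
) -> List[float]:
    """Return new weights after one TD(0) step.

    The TD error compares the current estimate V(s_t) with the
    bootstrapped target reward + gamma * V(s_{t+1}); a terminal
    successor is assigned value 0.
    """
    v_t = evaluate(weights, features_t)
    v_next = 0.0 if terminal else evaluate(weights, features_next)
    td_error = reward + gamma * v_next - v_t
    # For a linear evaluator the gradient of V(s_t) w.r.t. the weights
    # is simply the feature vector of s_t.
    return [w + alpha * td_error * f for w, f in zip(weights, features_t)]


# Toy usage: two made-up features, final position rewarded with +1 (a win).
w = [0.0, 0.0]
w = td0_update(w, [1.0, 0.0], [0.0, 0.0], reward=1.0, alpha=0.5, terminal=True)
# An earlier position now bootstraps from the learned value of its successor.
w = td0_update(w, [0.0, 1.0], [1.0, 0.0], reward=0.0, alpha=0.5)
```

Repeating such updates over many self-play or training games is one way an evaluator could improve with experience, which is the kind of self-learning behavior the abstract describes.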
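Memo
Received: 2011-10-17. Published online: 2012-05-18.
Funding: Scientific Research Project of the Chongqing Municipal Education Commission (KJ120824); Natural Science Foundation of Chongqing (2007BB2415).
Corresponding author: ZHANG Xiaochuan. E-mail: cqpczxc@163.com.
About the authors:
ZHANG Xiaochuan, male, born in 1965, professor, deputy director of the Computer Games Technical Committee of the Chinese Association for Artificial Intelligence. His main research interests include artificial intelligence, artificial life, and computer software. He has led 6 national and provincial/ministerial projects and more than 30 industry-sponsored projects, and has received one Chongqing Natural Science Award, one Science and Technology Progress Award, and one Chongqing Teaching Achievement Award. He has served as chief editor of 3 textbooks and published more than 50 academic papers.
TANG Yan, female, born in 1987, master's student. Her main research interests are computational intelligence and intelligent software.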
Last Update:
2012-09-05