[1] WU Licheng, WU Qifei, ZHONG Hongming, et al. Algorithm for “Hearts” game based on convolutional neural network[J]. CAAI Transactions on Intelligent Systems, 2023, 18(4): 775-782. [doi: 10.11992/tis.202203030]

Algorithm for “Hearts” game based on convolutional neural network

References:
[1] BLAIR A, SAFFIDINE A. AI surpasses humans at six-player poker[J]. Science, 2019, 365(6456): 864–865.
[2] MORAVČÍK M, SCHMID M, BURCH N, et al. DeepStack: expert-level artificial intelligence in heads-up no-limit poker[J]. Science, 2017, 356(6337): 508–513.
[3] BROWN N, SANDHOLM T. Superhuman AI for heads-up no-limit poker: Libratus beats top professionals[J]. Science, 2018, 359(6374): 418–424.
[4] PENG Qiwen, WANG Yisong, YU Xiaomin, et al. Monte Carlo tree search for “Doudizhu” based on hand splitting[J]. Journal of Nanjing Normal University (natural science edition), 2019, 42(3): 107–114.
[5] XU Fangjing, WEI Kunpeng, WANG Yisong, et al. “Doudizhu” strategy based on convolutional neural networks[J]. Computer and modernization, 2020(11): 28–32.
[6] MA Xiao, WANG Xuan, WANG Xiaolong. Information model for a class of incomplete information games[J]. Journal of computer research and development, 2010, 47(12): 2100–2109.
[7] WANG Xuan, XU Chaoyang. The application of temporal difference in incomplete information games[C]//China Machine Game Academic Symposium. Chongqing: Journal of Chongqing Institute of Technology, 2007: 16-22.
[8] ZHANG Jiajia, WANG Xuan, YANG Ling, et al. Analysis of UCT algorithm policies in imperfect information game[C]//2012 IEEE 2nd International Conference on Cloud Computing and Intelligence Systems. Piscataway: IEEE, 2013: 132-137.
[9] ZHANG Jiajia. Building opponent model in imperfect information board games[J]. TELKOMNIKA Indonesian journal of electrical engineering, 2014, 12(3): 1975–1986.
[10] ZHANG Jiajia, WANG Xuan. Using modified UCT algorithm basing on risk estimation methods in imperfect information games[J]. International journal of multimedia and ubiquitous engineering, 2014, 9(10): 23–32.
[11] GINSBERG M L. GIB: imperfect information in a computationally challenging game[J]. Journal of artificial intelligence research, 2001, 14: 303–358.
[12] BOWLING M, BURCH N, JOHANSON M, et al. Heads-up limit hold’em poker is solved[J]. Science, 2015, 347(6218): 145–149.
[13] BROWN N, SANDHOLM T, AMOS B. Depth-limited solving for imperfect-information games[C]//Proceedings of the 32nd International Conference on Neural Information Processing Systems. New York: ACM, 2018: 7674-7685.
[14] SILVER D, SCHRITTWIESER J, SIMONYAN K, et al. Mastering the game of Go without human knowledge[J]. Nature, 2017, 550(7676): 354–359.
[15] SILVER D, HUBERT T, SCHRITTWIESER J, et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play[J]. Science, 2018, 362(6419): 1140–1144.
[16] LI Yi. Research on intelligent decision model of Texas Hold’em computer game[D]. Chongqing: Chongqing University of Technology, 2020.
[17] LI Yi, PENG Lirong, DU Song, et al. A decision model for Texas Hold’em game[J]. Software guide, 2021, 20(5): 16–19.
[18] ZHANG Meng, LI Kai, WU Zhe, et al. An opponent modeling and strategy integration framework for Texas Hold’em AI[J]. Acta automatica Sinica, 2022, 48(4): 1004–1017.
[19] ZHOU Qibin, BAI Dongdong, ZHANG Junge, et al. DecisionHoldem: safe depth-limited solving with diverse opponents for imperfect-information games[EB/OL]. (2022-01-27)[2022-03-17]. https://arxiv.org/abs/2201.11580.
[20] LI Saisai, LI Shuqin, DING Meng, et al. Research on fight the landlord’s single card guessing based on deep learning[M]. Cham: Springer International Publishing, 2018: 363-372.
[21] YOU Yang, LI Liangwei, GUO Baisong, et al. Combinational Q-learning for Dou Di Zhu[EB/OL]. (2019-02-19)[2022-05-19]. https://arxiv.org/pdf/1901.08925v1.pdf.
[22] JIANG Qiqi, LI Kuangzheng, DU Boyao, et al. DeltaDou: expert-level doudizhu AI through self-play[C]//Proceedings of the 28th International Joint Conference on Artificial Intelligence. Macao: AAAI Press, 2019: 1265-1271.
[23] PENG Qiwen. Research on “Doudizhu” based on Monte Carlo tree search[D]. Guiyang: Guizhou University, 2020.
[24] ZHA Daochen, XIE Jingru, MA Wenye, et al. DouZero: mastering DouDizhu with self-play deep reinforcement learning[EB/OL]. (2021-06-11)[2022-03-17]. https://arxiv.org/abs/2106.06135.
[25] GUO Rongcheng, LI Shuqin, GONG Yuanhan, et al. Research on game strategy in two-on-one game endgame mode[J]. Intelligent computer and application, 2022, 12(4): 151–158.
[26] YANG Guan, LIU Minghuan, HONG Weijun, et al. PerfectDou: dominating DouDizhu with perfect information distillation[EB/OL]. (2022-03-30)[2022-03-17]. https://arxiv.org/abs/2203.16406.
[27] Compilation Group of Chinese Hua Pai Competition Rules. Chinese Hua Pai competition rules (trial)[M]. Beijing: People’s Sports Publishing House, 2009: 2–4.

Copyright © CAAI Transactions on Intelligent Systems