[1] WU Licheng, WU Qifei, ZHONG Hongming, et al. Algorithm for “Hearts” game based on convolutional neural network[J]. CAAI Transactions on Intelligent Systems, 2023, 18(4): 775-782. [doi:10.11992/tis.202203030]

Algorithm for “Hearts” game based on convolutional neural network


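Memo

Received: 2022-03-17.
Funding: National Natural Science Foundation of China (61773416, 61873291).
About the authors: WU Licheng, professor and doctoral supervisor, selected for the first cohort of the State Ethnic Affairs Commission's training program for young and middle-aged talents. Main research interests: intelligent robots, computer games, and computational linguistics. Has led more than 10 projects, including National Natural Science Foundation of China and 863 Program projects; holds 4 granted invention patents; has received one second prize of the Ministry of Education Science and Technology Progress Award and one third prize of the Jiangsu Province Science and Technology Progress Award. Has published more than 100 academic papers, 1 monograph, 1 textbook, and 1 translated work. WU Qifei, master's student. Main research interest: computer games. LI Xiali, professor. Main research interest: machine games. Has led 2 General Program projects and 1 Young Scientists project of the National Natural Science Foundation of China and 1 provincial/ministerial-level project; has received 1 award under the Beijing higher education young talents program; holds more than 10 granted invention patents and registered software copyrights. Has published nearly 60 academic papers.
Corresponding author: LI Xiali. E-mail: xiaer_li@163.com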

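Copyright © Editorial Office of CAAI Transactions on Intelligent Systems
Address: Building 145-1, Nantong Street, Nangang District, Harbin, Heilongjiang Province (150001). Tel: 0451-82534001, 82518134. E-mail: tis@vip.sina.com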