JIANG Yinjie, KUANG Kun, WU Fei. Big data intelligence: from the optimal solution of data fitting to the equilibrium solution of game theory[J]. CAAI Transactions on Intelligent Systems, 2020, 15(1): 175-182. [doi:10.11992/tis.201911007]

Big data intelligence: from the optimal solution of data fitting to the equilibrium solution of game theory

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
15
Issue:
2020, No. 1
Pages:
175-182
Column:
Artificial Intelligence Deans' Forum
Publication date:
2020-01-01

Article Info

Title:
Big data intelligence: from the optimal solution of data fitting to the equilibrium solution of game theory
Author(s):
JIANG Yinjie (1, 2), KUANG Kun (1, 2), WU Fei (1, 2)
1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China;
2. Institute of Artificial Intelligence, Zhejiang University, Hangzhou 310027, China
Keywords:
artificial intelligence; big data; optimal fitting; neural network architecture search; game theory; Nash equilibrium
CLC number:
TP391
DOI:
10.11992/tis.201911007
Abstract:
Data-driven machine learning, especially deep learning, has made great progress in natural language processing, computer vision, and speech recognition, and is a focus of artificial intelligence research. Parameter optimization in traditional machine learning can be regarded as data fitting: optimization algorithms fit the model that is optimal on the training set, that is, the model with the smallest average loss. However, in many real-world problems, such as commercial auctions and resource allocation, the goal of an artificial intelligence algorithm should be not an optimal solution but an equilibrium solution, one that still performs well when conditions change dynamically. This requires applying game theory to big data intelligence. Methods such as Monte Carlo tree search and reinforcement learning make it possible to combine game theory with artificial intelligence and to seek the equilibrium solution of an adversarial game model. Moving from the optimal solution of data fitting to the equilibrium solution of adversarial games gives big data intelligence a broader range of applications.
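
The contrast between a fitted optimum and an equilibrium can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`fit_slope`, `mixed_equilibrium_2x2`, `worst_case_payoff`), the toy data, and the choice of matching pennies as the game are all assumptions made for illustration. The first function fits a one-parameter model by minimizing average squared loss; the second computes the mixed-strategy Nash equilibrium of a 2x2 zero-sum game in closed form; the third measures how a strategy fares against an opponent who adapts to it.

```python
from typing import List, Tuple


def fit_slope(xs: List[float], ys: List[float]) -> float:
    """Least-squares slope w for the model y = w * x.

    This is the 'data fitting' view: pick the parameter that minimizes
    the average squared loss on a fixed training set.
    """
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def mixed_equilibrium_2x2(a: List[List[float]]) -> Tuple[float, float, float]:
    """Mixed Nash equilibrium of a 2x2 zero-sum game.

    a[i][j] is the row player's payoff (the column player receives -a[i][j]).
    Assumption: the game has no pure-strategy saddle point, so both players
    mix; this is not verified beyond rejecting a zero denominator.
    Returns (p, q, value): p is the probability of row 0, q the probability
    of column 0, and value the row player's expected payoff at equilibrium.
    """
    denom = a[0][0] - a[0][1] - a[1][0] + a[1][1]
    if denom == 0:
        raise ValueError("closed form does not apply to this payoff matrix")
    p = (a[1][1] - a[1][0]) / denom  # makes the column player indifferent
    q = (a[1][1] - a[0][1]) / denom  # makes the row player indifferent
    value = (a[0][0] * a[1][1] - a[0][1] * a[1][0]) / denom
    return p, q, value


def worst_case_payoff(a: List[List[float]], p: float) -> float:
    """Row player's expected payoff when the column player best-responds to p."""
    return min(p * a[0][j] + (1 - p) * a[1][j] for j in range(2))


if __name__ == "__main__":
    # Data fitting: toy points that lie on y = 2x.
    print("fitted slope:", fit_slope([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))

    # Game: matching pennies, row player's payoffs.
    pennies = [[1.0, -1.0], [-1.0, 1.0]]
    p, q, value = mixed_equilibrium_2x2(pennies)
    print("equilibrium:", p, q, value)
    print("pure strategy, worst case:", worst_case_payoff(pennies, 1.0))
    print("equilibrium strategy, worst case:", worst_case_payoff(pennies, p))
```

In this toy game, always playing row 0 can be exploited by an opponent who adapts to it, while the equilibrium mix cannot be pushed below the value of the game. That is the sense in which an equilibrium solution remains effective when the other side changes its behavior.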

Memo:
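Received: 2019-11-11.
Funding: National Science Fund for Distinguished Young Scholars (61625707); National Natural Science Foundation of China, Emergency Management Project for Basic Research on Artificial Intelligence (61751209).
About the authors: JIANG Yinjie, Ph.D. candidate. Research interests: artificial intelligence and neural network architecture search. KUANG Kun, assistant professor. Research interests: causal inference, stable learning, interpretable machine learning, and applications of AI in medicine and law. Has served on the program committees of international conferences including NIPS, AAAI, CIKM, and ICDM, and has published more than 10 papers in top conferences and journals, including KDD, ICML, MM, AAAI, and TKDD. WU Fei, professor and doctoral supervisor, director of the Institute of Artificial Intelligence at Zhejiang University; member of the 7th Council of the China Society of Image and Graphics, deputy director of that society's Technical Committee on Animation and Digital Entertainment, and standing member of the Technical Committee on Multimedia Technology of the China Computer Federation. Research interests: artificial intelligence, cross-media computing, multimedia analysis and retrieval, and statistical learning theory. Recipient of the National Science Fund for Distinguished Young Scholars and the Baosteel Outstanding Teacher Award; selected for the Ministry of Education's New Century Excellent Talents Program and the "Outstanding Teacher Award Program for Computer Science in Universities"; first-tier member of Zhejiang Province's 151 Talent Project; head of the working group of the Ministry of Education's Expert Group on Artificial Intelligence Science and Technology Innovation. Has published more than 70 academic papers.
Corresponding author: JIANG Yinjie. E-mail: jiangyinjie@zju.edu.cn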
Last Update: 1900-01-01