LIAN Chuanqiang, XU Xin, WU Jun, et al. Q-CF multi-agent reinforcement learning for resource allocation problems[J]. CAAI Transactions on Intelligent Systems, 2011, 6(2): 95-100.

Q-CF multi-agent reinforcement learning for resource allocation problems


Memo

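Received: 2010-03-25.
Funding: National Natural Science Foundation of China (60774076, 90820302).
Corresponding author: LIAN Chuanqiang.
E-mail: wzdslcq@163.com.
About the authors:
LIAN Chuanqiang, male, born in 1986, master's student. His main research interests are pattern recognition and machine learning.
XU Xin, male, born in 1974, research professor, Ph.D. His main research interests are reinforcement learning, adaptive dynamic programming theory and algorithms, intelligent mobile robots, and intelligent systems.
WU Jun, male, born in 1980, Ph.D. candidate. His main research interests are multi-robot systems, machine learning, and intelligent systems.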

Last Update: 2011-05-19
Copyright © Editorial Office of CAAI Transactions on Intelligent Systems