[1] ZHANG Wenxu, MA Lei, HE Huilin, et al. Air-ground heterogeneous coordination for multi-agent coverage based on reinforced learning[J]. CAAI Transactions on Intelligent Systems, 2018, 13(2): 202-207. [doi:10.11992/tis.201609017]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
13
Issue:
No. 2, 2018
Pages:
202-207
Section:
Academic Papers: Machine Learning
Publication date:
2018-04-15
- Title:
-
Air-ground heterogeneous coordination for multi-agent coverage based on reinforced learning
- Author(s):
-
ZHANG Wenxu (张文旭), MA Lei (马磊), HE Huilin (贺荟霖), WANG Xiaodong (王晓东)
-
School of Electrical Engineering, Southwest Jiaotong University, Chengdu 610031, Sichuan, China
- Keywords:
-
heterogeneous multi-agent system; coverage problem; air-ground; UAV/UGV; DEC-POMDPs; reinforcement learning
- CLC number:
-
TP181
- DOI:
-
10.11992/tis.201609017
- Abstract:
-
This study addresses heterogeneous cooperative tasks carried out by unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs). To extend and improve dynamic coverage by heterogeneous multi-agent systems, an air-ground heterogeneous multi-agent cooperative coverage model is proposed that exploits the complementary characteristics of UAVs and UGVs. During coverage, the UAV uses its advantages in speed and observation range to guide the actions of the UGV. To account for the agents' partial observability and uncertainty, the coverage scenario is modeled as a decentralized partially observable Markov decision process (DEC-POMDP), and a multi-agent reinforcement learning algorithm is used to cover the environment. Simulation results show that cooperation between the UAV and the UGV speeds up the team's coverage of the environment, and that the reinforcement learning algorithm improves the effectiveness of the coverage model.
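The ingredients named in the abstract (a wide-view UAV guiding a narrow-view UGV, a coverage reward, and a learning update on local observations) can be illustrated with a toy sketch. This is not the authors' model or algorithm: the 6x6 grid, the square sensor footprints, the "newly covered cells" reward, the nearest-uncovered-cell hint, and every function name below are assumptions made for illustration only, and the update shown is a generic tabular Q-learning step.

```python
from collections import defaultdict

GRID = 6  # hypothetical 6x6 grid world (assumption, not from the paper)

def footprint(pos, radius):
    """Cells within Chebyshev distance `radius` of `pos`, clipped to the grid."""
    r, c = pos
    return {(i, j)
            for i in range(max(0, r - radius), min(GRID, r + radius + 1))
            for j in range(max(0, c - radius), min(GRID, c + radius + 1))}

def step_reward(covered, pos, radius):
    """Assumed team reward: number of newly covered cells.

    Updates the shared `covered` set in place."""
    new_cells = footprint(pos, radius) - covered
    covered |= new_cells
    return len(new_cells)

def uav_hint(covered, ugv_pos):
    """Hypothetical UAV guidance: the uncovered cell nearest to the UGV
    (Manhattan distance, ties broken by cell coordinates), or None."""
    todo = [(abs(i - ugv_pos[0]) + abs(j - ugv_pos[1]), (i, j))
            for i in range(GRID) for j in range(GRID)
            if (i, j) not in covered]
    return min(todo)[1] if todo else None

def q_update(Q, obs, action, reward, next_obs, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step on an agent's local observation."""
    best_next = max(Q[(next_obs, a)] for a in actions)
    Q[(obs, action)] += alpha * (reward + gamma * best_next - Q[(obs, action)])
    return Q[(obs, action)]

# Usage: the UAV (radius 2) sees far more than the UGV (radius 1).
covered = set()
r_uav = step_reward(covered, (2, 2), radius=2)   # 5x5 footprint
r_ugv = step_reward(covered, (5, 5), radius=1)   # corner footprint, one cell overlaps
hint = uav_hint(covered, (5, 5))                 # cell the UAV suggests to the UGV

ACTIONS = ["up", "down", "left", "right"]
Q = defaultdict(float)
obs, next_obs = ((5, 5), hint), ((4, 5), hint)   # local observation = (own position, hint)
q_val = q_update(Q, obs, "up", r_ugv, next_obs, ACTIONS)
```

In this sketch each agent would keep its own Q-table keyed on its local observation, which reflects the decentralized, partially observable setting in spirit only; a full DEC-POMDP also specifies transition and observation probabilities, which are omitted here.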
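Memo
Received: 2016-09-21.
Funding: National Natural Science Foundation of China, Young Scientists Fund (61304166).
About the authors: ZHANG Wenxu, male, born in 1985, is a Ph.D. candidate whose main research interests are multi-agent systems and machine learning; he has published 4 academic papers, all 4 of which are indexed by EI. MA Lei, male, born in 1972, is a professor with a Ph.D. whose main research interests include control theory and its applications in robotics, new energy, and rail transit systems; he has led 14 domestic and international projects and published more than 40 academic papers, 37 of which are indexed by EI. HE Huilin, female, born in 1993, is a master's student whose main research interest is machine learning. WANG Xiaodong, male, born in 1992, is a master's student whose main research interest is machine learning; he holds 3 national invention patents and has published 4 academic papers.
Corresponding author: ZHANG Wenxu. E-mail: wenxu_zhang@163.com.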