[1]YONG Yuchen,LI Ziyu,DONG Qi.Multi-UAV within-visual-range air combat based on hierarchical multiagent reinforcement learning[J].CAAI Transactions on Intelligent Systems,2025,20(3):548-556.[doi:10.11992/tis.202408008]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 20
Issue: 2025(3)
Pages: 548-556
Column: Academic Papers - Machine Learning
Publication date: 2025-05-05
- Title:
Multi-UAV within-visual-range air combat based on hierarchical multiagent reinforcement learning
- Author(s):
YONG Yuchen1,2; LI Ziyu3; DONG Qi2
1. College of Software Engineering, Southeast University, Nanjing 211189, China;
2. Electronic Science Research Institute of China Electronics Technology Group Corporation, Beijing 100041, China;
3. School of Information Science and Engineering, Southeast University, Nanjing 210096, China
- Keywords:
air combat within visual range; dogfight; autonomous decision-making; self-play; hierarchical reinforcement learning; multiagent game; hierarchical decision networks; reward function design
- CLC:
TP18
- DOI:
10.11992/tis.202408008
- Abstract:
To improve the autonomous maneuvering decision-making capability of unmanned aerial vehicles (UAVs) in within-visual-range air combat, this paper proposes a hierarchical decision network framework based on self-play (SP) and multiagent reinforcement learning (MARL). A multi-UAV dogfight scenario is studied by combining SP with an MARL algorithm. The complex air combat task is divided into an upper-level missile-strike task and a lower-level flight-tracking task, which reduces the ambiguity of tactical actions and improves autonomous maneuvering decision-making in the multi-UAV dogfight scenario. In addition, through an innovative reward function design and the SP method, the algorithm reduces the meaningless exploration an agent performs in a large battlefield environment. Simulation results show that the algorithm helps agents learn both basic flight tactics and advanced combat tactics, achieving stronger defensive and offensive capabilities than other multiagent air combat algorithms.
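The two-level decomposition described in the abstract can be sketched in minimal form: an upper-level policy selects a tactical subtask (missile strike or flight tracking), and a lower-level policy, conditioned on that subtask, emits a concrete maneuver. All names, the subtask/maneuver sets, and the random placeholder "policies" here are illustrative assumptions, not the paper's actual decision networks or action spaces.

```python
import random

# Hypothetical subtask and maneuver sets; the paper's real action
# spaces are not specified in this record.
SUBTASKS = ["missile_strike", "flight_tracking"]
MANEUVERS = {
    "missile_strike": ["lock_target", "launch_missile", "evade"],
    "flight_tracking": ["climb", "dive", "turn_left", "turn_right"],
}

def upper_policy(observation):
    # Placeholder for the upper-level decision network: in the paper
    # this would be a learned policy over tactical subtasks.
    return random.choice(SUBTASKS)

def lower_policy(observation, subtask):
    # Placeholder for the subtask-conditioned lower-level network,
    # which outputs a flight maneuver for the chosen subtask.
    return random.choice(MANEUVERS[subtask])

def decide(observation):
    # Hierarchical decision: first pick the subtask, then the maneuver.
    subtask = upper_policy(observation)
    maneuver = lower_policy(observation, subtask)
    return subtask, maneuver

if __name__ == "__main__":
    obs = {"own_heading": 90.0, "enemy_bearing": 45.0}  # toy observation
    print(decide(obs))
```

The point of the hierarchy, as the abstract argues, is that each level searches a much smaller action space than a flat policy would, which is what reduces the ambiguity of tactical actions during training.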