[1]OUYANG Yongping,WEI Changyun,CAI Boliang.Collision avoidance approach for distributed heterogeneous multirobot systems in dynamic environments[J].CAAI Transactions on Intelligent Systems,2022,17(4):752-763.[doi:10.11992/tis.202106044]
CAAI Transactions on Intelligent Systems[ISSN 1673-4785/CN 23-1538/TP] Volume:
17
Issue:
2022, No. 4
Page number:
752-763
Column:
Academic Papers: Intelligent Systems
Publication date:
2022-07-05
- Title: Collision avoidance approach for distributed heterogeneous multirobot systems in dynamic environments
- Author(s): OUYANG Yongping 1; WEI Changyun 1; CAI Boliang 1,2
  1. College of Mechanical and Electrical Engineering, Hohai University, Changzhou 213022, China
  2. School of Engineering, Cardiff University, Cardiff CF10 3AT, UK
- Keywords: heterogeneous multi-robot systems; deep reinforcement learning; non-structural environment; multi-feature policy gradients; dynamic collision avoidance; self-learning; distributed control; control policy
- CLC: TP273+.2
- DOI: 10.11992/tis.202106044
- Abstract:
Multirobot systems are widely used in cooperative search and rescue, intelligent warehousing, intelligent transportation, and other fields. At present, path planning and collision avoidance among multiple robots and with the dynamic environment still rely on accurate maps, which poses challenges for the coordination and cooperation of multirobot systems in unstructured environments. To address this problem, this paper presents a navigation and collision avoidance approach, based on a deep reinforcement learning framework, that does not require accurate maps. A multifeature policy gradients algorithm is proposed, and social norms are integrated so that the learning agent can obtain the optimal control policy through trial-and-error interaction with the environment. The policy is trained in the Gazebo simulation environment and then transferred to several heterogeneous real robots by decoding the control signals. Experimental results show that the proposed multifeature policy gradients algorithm can obtain the optimal navigation and collision avoidance policy through self-learning, providing a technical reference for applying distributed heterogeneous multirobot systems in dynamic environments.