German University in Cairo (GUC) · Stuttgart · Jun 1, 2026 · arXiv:2606.02107

Network Distributed Multi-Agent Reinforcement Learning for Consensus Control of Quadcopters

Youssef Mahran, Zeyad Gamal, Aamir Ahmad, Ayman El-Badawy

AI Summary

This paper introduces a Network Distributed Multi-Agent Reinforcement Learning (ND-MARL) framework for quadcopter consensus control that incorporates the swarm communication graph into the decision process. Under a 2-Neighbor communication topology, each quadcopter agent uses information from only two neighbors and outputs an action through a distributed policy. Compared against a centralized MARL controller, the framework produces smooth consensus trajectories, and it exhibits zero-shot scalability: policies trained on a small group of agents are deployed to larger swarms without retraining.

Key Contribution

Zero-shot scalability: policies trained on just three agents are deployed to swarms of up to 250 agents under the same 2-Neighbor communication topology, without retraining or fine-tuning.

Abstract

This paper proposes a Network Distributed Multi-Agent Reinforcement Learning (ND-MARL) framework for quadcopter consensus control. Compared to conventional MARL formulations that rely on centralized planning or fully decentralized execution, ND-MARL incorporates the swarm communication graph into the decision process. Under a 2-Neighbor communication topology, each agent observes information from only two neighbors and outputs an action through a distributed policy. A high-level distributed consensus planner is trained using Multi-Agent Soft Actor-Critic (MASAC) and embedded in a hierarchical stack to generate reference target positions tracked by a low-level quadcopter controller. Results demonstrate smooth consensus trajectories and planner-tracker integration when compared to a centralized MARL controller. Most notably, the learned controller exhibits zero-shot scalability: policies trained on a three-agent system are deployed to swarms of up to 250 agents under the same 2-Neighbor communication topology without retraining or fine-tuning, achieving consistent convergence, with steady-state spread increasing at large team sizes due to sparse information propagation. These findings highlight ND-MARL as a stable framework for distributed, communication-aware quadcopter consensus control.
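
The 2-Neighbor observation structure described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not say which two neighbors each agent sees (a ring is assumed), does not give the observation layout (own position plus two relative positions is assumed), and `placeholder_policy` is a hypothetical neighbor-averaging rule standing in for the MASAC-trained planner, not the authors' policy.

```python
import numpy as np

def two_neighbor_ring(n):
    """Return an (n, 2) index array; row i lists agent i's two neighbors.
    Assumption: neighbors are the previous and next agent on a ring."""
    idx = np.arange(n)
    return np.stack([(idx - 1) % n, (idx + 1) % n], axis=1)

def local_observations(positions, neighbors):
    """Per-agent observation: own position plus both neighbors' relative positions.
    Assumed layout; the width is 9 regardless of how many agents exist."""
    rel = positions[neighbors] - positions[:, None, :]  # shape (n, 2, 3)
    return np.concatenate([positions, rel.reshape(len(positions), -1)], axis=1)

def placeholder_policy(obs, gain=0.5):
    """Hypothetical stand-in for the learned planner: output a target position
    shifted toward the mean of the two neighbors."""
    own = obs[:, :3]
    rel = obs[:, 3:].reshape(-1, 2, 3)
    return own + gain * rel.mean(axis=1)

rng = np.random.default_rng(0)
for n in (3, 250):
    pos = rng.uniform(-5.0, 5.0, size=(n, 3))
    obs = local_observations(pos, two_neighbor_ring(n))
    targets = placeholder_policy(obs)
    print(n, obs.shape, targets.shape)
```

Under these assumptions the per-agent observation has the same width for 3 agents and for 250, which is the structural property that lets one shared policy be applied to a larger swarm without changing its input size. The sketch says nothing about convergence quality or steady-state spread; those are empirical results reported by the paper.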

Distributed Systems & Hardware · Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
