This paper introduces a graph-based multi-agent reinforcement learning (MARL) framework for decentralized cooperative UAV deployment, trained with centralized training and decentralized execution (CTDE). The architecture uses an agent-entity attention module to encode local state and nearby entities, and aggregates inter-UAV messages with neighbor self-attention over a distance-limited communication graph. Evaluated on cooperative relay (DroneConnect) and adversarial engagement (DroneCombat) tasks, the method achieves high coverage under restricted communication and generalizes to unseen team sizes.
UAV swarms can achieve near-optimal cooperative deployment and generalize to new team sizes using a communication-aware MARL approach, even with limited communication and partial observability.
Autonomous Unmanned Aerial Vehicle (UAV) swarms are increasingly used as rapidly deployable aerial relays and sensing platforms, yet practical deployments must operate under partial observability and intermittent peer-to-peer links. We present a graph-based multi-agent reinforcement learning framework trained under centralized training with decentralized execution (CTDE): a centralized critic and global state are available only during training, while each UAV executes a shared policy using local observations and messages from nearby neighbors. Our architecture encodes local agent state and nearby entities with an agent-entity attention module, and aggregates inter-UAV messages with neighbor self-attention over a distance-limited communication graph. We evaluate primarily on a cooperative relay deployment task (DroneConnect) and secondarily on an adversarial engagement task (DroneCombat). In DroneConnect, the proposed method achieves high coverage under restricted communication and partial observation (e.g., 74% coverage with M = 5 UAVs and N = 10 nodes) while remaining competitive with an offline mixed-integer linear programming (MILP) upper bound, and it generalizes to unseen team sizes without fine-tuning. In the adversarial setting, the same framework transfers without architectural changes and improves win rate over non-communicating baselines.
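The core message-aggregation step described above can be sketched in a few lines. The following is a minimal NumPy illustration, not the paper's implementation: random untrained projection matrices stand in for the learned query/key/value weights, and the function name and signature are hypothetical. It shows the key mechanism, self-attention over messages in which attention scores are masked by a distance-limited communication graph, so each UAV only aggregates messages from peers within communication range (and from itself).

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def neighbor_attention(positions, messages, comm_range, seed=0):
    """Aggregate per-UAV messages with self-attention restricted to
    neighbors inside comm_range (a distance-limited communication graph).

    positions: (n, 2) UAV coordinates
    messages:  (n, d) per-UAV message vectors
    Returns:   (n, d) aggregated messages, one per UAV.
    """
    n, d = messages.shape
    # Illustrative stand-ins for the learned Q/K/V projections.
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    Q, K, V = messages @ Wq, messages @ Wk, messages @ Wv

    # Pairwise distances define the communication graph; each UAV is
    # always its own neighbor (distance 0).
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    in_range = dist <= comm_range

    # Scaled dot-product scores; out-of-range peers are masked out
    # before the softmax, so they receive ~zero attention weight.
    scores = (Q @ K.T) / np.sqrt(d)
    scores = np.where(in_range, scores, -1e9)
    return softmax(scores, axis=-1) @ V

# Example: 5 UAVs, 8-dim messages, communication range 2.0.
positions = np.array([[0.0, 0.0], [1.0, 0.0], [1.5, 1.0], [5.0, 5.0], [0.5, 0.5]])
messages = np.random.default_rng(1).standard_normal((5, 8))
agg = neighbor_attention(positions, messages, comm_range=2.0)
```

An isolated UAV (here the one at (5, 5), out of range of all peers) attends only to its own message, which is what lets the same shared policy run on teams of unseen size: the aggregation depends only on each agent's local neighborhood, not on a fixed team dimension.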