This paper tackles the problem of ensuring safe separation between small unmanned aircraft systems (sUAS) when GPS signals are unreliable due to degradation or spoofing attacks. The authors model the GPS corruption as a zero-sum game and derive a closed-form expression for an adversarial policy that perturbs the observed state to maximally degrade agent safety. By integrating this adversarial policy into a MARL policy gradient algorithm, they achieve near-zero collision rates in high-density simulations even under significant GPS corruption, outperforming non-robust baselines.
Forget adversarial training: a closed-form solution can make multi-agent RL for drone collision avoidance surprisingly robust to GPS spoofing.
We address robust separation assurance for small Unmanned Aircraft Systems (sUAS) under GPS degradation and spoofing via Multi-Agent Reinforcement Learning (MARL). In cooperative surveillance, each aircraft (or agent) broadcasts its GPS-derived position; when such position broadcasts are corrupted, the entire observed air traffic state becomes unreliable. We cast this state observation corruption as a zero-sum game between the agents and an adversary: with probability R, the adversary perturbs the observed state to maximally degrade each agent's safety performance. We derive a closed-form expression for this adversarial perturbation, bypassing adversarial training entirely and enabling linear-time evaluation in the state dimension. We show that this expression approximates the true worst-case adversarial perturbation with second-order accuracy. We further bound the safety performance gap between clean and corrupted observations, showing that it degrades at most linearly with the corruption probability under Kullback-Leibler regularization. Finally, we integrate the closed-form adversarial policy into a MARL policy gradient algorithm to obtain a robust counter-policy for the agents. In a high-density sUAS simulation, we observe near-zero collision rates under corruption levels up to 35%, outperforming a baseline policy trained without adversarial perturbations.
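The abstract does not reproduce the paper's closed-form perturbation, but its key properties (worst-case to second-order accuracy, linear-time in the state dimension, applied with probability R) match the standard first-order construction: step against the gradient of the agent's value within a bounded perturbation budget. The sketch below illustrates that construction on a toy safety value; `safety_value`, `eps`, and the quadratic form are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

def safety_value(s):
    # Toy stand-in for an agent's safety value: higher when the observed
    # relative positions in s are far from a conflict at the origin.
    return float(np.dot(s, s))

def grad_safety_value(s):
    # Analytic gradient of the toy value above.
    return 2.0 * s

def worst_case_perturbation(s, eps):
    # First-order worst case within an L2 ball of radius eps:
    # delta = -eps * g / ||g||, where g is the value gradient.
    # Evaluation cost is linear in the state dimension.
    g = grad_safety_value(s)
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return np.zeros_like(s)
    return -eps * g / norm

def corrupt_observation(s, eps=0.5, R=0.35):
    # With probability R the adversary replaces the clean observation
    # with its worst-case perturbed version; otherwise it passes through.
    if rng.random() < R:
        return s + worst_case_perturbation(s, eps)
    return s

s = np.array([3.0, 4.0])                       # clean observed state
s_adv = s + worst_case_perturbation(s, eps=0.5)
print(safety_value(s), safety_value(s_adv))    # perturbation lowers the value
```

During robust training, observations fed to the MARL policy gradient update would pass through `corrupt_observation`, so the learned counter-policy sees worst-case states at the corruption rate R.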