This paper introduces a novel Multi-Agent Reinforcement Learning (MARL) framework for topological Multi-Agent Path Finding (MAPF) that leverages graph-structured Partially Observable Markov Decision Processes (POMDPs). The framework incorporates a Breadth-First Neighbor-Limited Search (BFNLS) algorithm for scalable observation/action spaces and a Graph Structure Awareness (GSA) model that combines spectral and spatial analysis to capture both local and global topological information. The proposed method, using Value Decomposition Networks (VDN) for cooperative MARL, demonstrates improved success rates and planning efficiency in simulations and real-robot experiments compared to existing approaches.
Topological maps can unlock scalable multi-agent pathfinding, and this MARL framework combining spectral graph analysis with spatial GCNs proves it.
Efficient Multi-Agent Path Finding (MAPF) is pivotal for warehouse logistics. While existing learning-based methods primarily rely on computationally intensive grid-based representations, topological maps offer a more flexible and scalable alternative, though this approach remains understudied. To address this gap, we propose a novel Multi-Agent Reinforcement Learning (MARL) framework for topological MAPF with three key innovations: (1) a graph-structured POMDP formulation utilizing our Breadth-First Neighbor-Limited Search (BFNLS) algorithm to define scalable observation/action spaces while maintaining a fixed dimension; (2) a Graph Structure Awareness (GSA) model that combines spectral (eigenvalue-based) and spatial (graph convolutional network-based) analysis to integrate local subgraph features with global topological importance metrics; and (3) a cooperative MARL architecture employing Value Decomposition Networks (VDN) to explicitly model agent dependencies through graph-aware credit assignment. Simulation results show our method achieves higher success rates than baseline learning-based methods and greater planning efficiency than search-based methods, and real-robot experiments confirm its effectiveness in a physical setting.
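The fixed-dimension observation/action space idea behind BFNLS can be illustrated with a minimal sketch: a breadth-first traversal that stops after collecting k vertices and pads to length k, so every agent sees a constant-size neighborhood regardless of the graph's local degree. The function name, adjacency format, and padding sentinel below are illustrative assumptions, not the paper's exact procedure.

```python
from collections import deque

def bfnls(adj, start, k):
    """Sketch of a breadth-first neighbor-limited search: gather the k
    vertices nearest to `start` in BFS order. `adj` maps a vertex to its
    neighbor list; the padding value -1 is a hypothetical sentinel."""
    visited = {start}
    order = [start]
    queue = deque([start])
    while queue and len(order) < k:
        v = queue.popleft()
        for u in adj.get(v, ()):
            if u not in visited:
                visited.add(u)
                order.append(u)
                queue.append(u)
                if len(order) == k:
                    break
    # Pad with the sentinel so the output always has dimension k.
    order += [-1] * (k - len(order))
    return order
```

Truncating and padding in this way is what keeps the observation tensor shape constant across agents and maps, which is what lets a single shared policy network be applied everywhere on the topological graph.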
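The VDN component can be summarised in a few lines: the joint action-value is the sum of per-agent utilities, Q_tot(s, a) = Σ_i Q_i(o_i, a_i), so maximising each Q_i independently also maximises Q_tot, which permits decentralised execution. The sketch below uses plain Python lists of Q-values in place of the paper's graph-aware networks; the function names are illustrative assumptions.

```python
def vdn_joint_q(per_agent_qs, joint_action):
    """Additive VDN decomposition: Q_tot is the sum of each agent's
    Q-value for its own action. TD error is computed on Q_tot during
    training, implicitly assigning credit across agents."""
    return sum(qs[a] for qs, a in zip(per_agent_qs, joint_action))

def greedy_joint_action(per_agent_qs):
    """Decentralised execution: each agent independently takes the
    argmax of its own Q-values; the additive form guarantees this
    joint action also maximises Q_tot."""
    return [max(range(len(qs)), key=qs.__getitem__) for qs in per_agent_qs]
```

In the paper's framework, each Q_i would be produced by the GSA-conditioned agent network over its BFNLS neighborhood; the summation structure shown here is standard VDN.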