This paper introduces a learning-enhanced auction-consensus framework for multi-robot task allocation (MRTA) that replaces the hand-crafted bidding function of CBBA with a learned neural bidding policy. Proximal Policy Optimization (PPO) is used to train bidding policies based on local observations, with rewards shaped by proximity to globally optimal solutions. Experiments demonstrate that learned bidding policies, particularly those using Neural Additive Models, LSTMs, or Set Transformers, improve solution quality over classical CBBA while maintaining decentralized execution.
Reinforcement learning can teach robot swarms to bid more effectively in task allocation auctions, outperforming hand-crafted heuristics without sacrificing decentralized control.
Multi-Robot Task Allocation (MRTA) is a central challenge in decentralized multi-agent systems, where teams of robots must cooperatively assign and execute tasks under limited communication while optimizing global performance objectives. Auction-consensus algorithms, such as the Consensus-Based Bundle Algorithm (CBBA), provide scalable decentralized coordination with provable convergence, but rely on hand-crafted greedy scoring functions that often lead to suboptimal task allocations. This paper proposes a learning-enhanced auction-consensus framework in which CBBA's deterministic bidding mechanism is replaced by a neural bidding policy trained using reinforcement learning. Under a centralized training and decentralized execution paradigm, agents learn to compute task bids from partial local observations while retaining the standard auction and consensus phases for decentralized coordination. The learned bidding policy is trained using Proximal Policy Optimization (PPO) with rewards shaped by proximity to globally optimal solutions obtained via mixed-integer linear programming. Multiple neural architectures are evaluated, including a Neural Additive Model, a Long Short-Term Memory (LSTM) network, and a Set Transformer. Experimental results across varying swarm sizes demonstrate that learned bidding policies can improve solution quality over classical CBBA while preserving decentralized execution. The proposed approach highlights the effectiveness of integrating reinforcement learning with classical distributed coordination algorithms, offering a scalable pathway toward higher-quality decentralized multi-robot task allocation.
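The two ideas in the abstract, a swappable bidding function and a reward measured against the global optimum, can be illustrated with a toy sketch. This is not the paper's implementation: it uses a simplified one-task-per-robot auction instead of CBBA's bundle and consensus phases, a brute-force search as a stand-in for the mixed-integer linear program, and an assumed reward of the form optimal cost divided by achieved cost. All names (`greedy_bid`, `run_auction`, `shaped_reward`, and so on) are hypothetical.

```python
import itertools
import math
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]
BidFn = Callable[[Point, Point], float]  # (robot position, task position) -> bid


def greedy_bid(robot: Point, task: Point) -> float:
    """Hand-crafted score (assumed form): nearer tasks receive higher bids."""
    return 1.0 / (1.0 + math.dist(robot, task))


def run_auction(robots: List[Point], tasks: List[Point], bid_fn: BidFn) -> Dict[int, int]:
    """Simplified auction: repeatedly award the highest outstanding bid.

    Each robot wins at most one task. A learned policy would be passed in
    as bid_fn in place of greedy_bid; the auction loop itself is unchanged.
    """
    assignment: Dict[int, int] = {}
    free_robots = set(range(len(robots)))
    free_tasks = set(range(len(tasks)))
    while free_robots and free_tasks:
        _, r, t = max(
            (bid_fn(robots[r], tasks[t]), r, t)
            for r in free_robots
            for t in free_tasks
        )
        assignment[r] = t
        free_robots.remove(r)
        free_tasks.remove(t)
    return assignment


def total_cost(robots: List[Point], tasks: List[Point], assignment: Dict[int, int]) -> float:
    """Sum of travel distances for the given robot -> task assignment."""
    return sum(math.dist(robots[r], tasks[t]) for r, t in assignment.items())


def optimal_cost(robots: List[Point], tasks: List[Point]) -> float:
    """Brute-force optimum (stand-in for a MILP solver).

    Assumes len(tasks) >= len(robots); only practical for tiny instances.
    """
    return min(
        sum(math.dist(robots[r], tasks[t]) for r, t in enumerate(perm))
        for perm in itertools.permutations(range(len(tasks)), len(robots))
    )


def shaped_reward(robots: List[Point], tasks: List[Point], assignment: Dict[int, int]) -> float:
    """Assumed reward: optimal cost / achieved cost, equal to 1.0 at the optimum."""
    achieved = total_cost(robots, tasks, assignment)
    if achieved == 0.0:
        return 1.0  # zero travel cost cannot be improved on
    return optimal_cost(robots, tasks) / achieved


# Tiny instance where the greedy bid is suboptimal.
robots = [(0.0, 0.0), (3.0, 0.0)]
tasks = [(1.0, 0.0), (-2.0, 0.0)]
allocation = run_auction(robots, tasks, greedy_bid)
reward = shaped_reward(robots, tasks, allocation)
```

In this instance the greedy bid awards the nearest task to robot 0 first, which forces robot 1 onto a distant task (total distance 6), while swapping the two assignments gives a total of 4. The resulting reward of about 0.67 is the kind of gap a trained bidding policy would be rewarded for closing.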