Mar 2, 2026 · arXiv:2603.02154

Boltzmann-based Exploration for Robust Decentralized Multi-Agent Planning

Nhat Nguyen, Duong Nguyen, Gianluca Rizzo, Hung Nguyen

AI Summary

The paper introduces Coordinated Boltzmann MCTS (CB-MCTS), a novel decentralized multi-agent planning algorithm designed to improve exploration in sparse or skewed reward environments. CB-MCTS replaces the deterministic UCT action selection in Dec-MCTS with a stochastic Boltzmann policy and a decaying entropy bonus to encourage sustained and focused exploration. The authors demonstrate through simulations that CB-MCTS outperforms Dec-MCTS in deceptive scenarios while maintaining competitive performance on standard benchmarks, showcasing its robustness.

Key Contribution

Boltzmann exploration, previously studied only in single-agent MCTS, is extended to decentralized multi-agent planning, yielding a planner that outperforms Dec-MCTS in deceptive reward scenarios.

Abstract

Decentralized Monte Carlo Tree Search (Dec-MCTS) is widely used for cooperative multi-agent planning but struggles in sparse or skewed reward environments. We introduce Coordinated Boltzmann MCTS (CB-MCTS), which replaces deterministic UCT with a stochastic Boltzmann policy and a decaying entropy bonus for sustained yet focused exploration. While Boltzmann exploration has been studied in single-agent MCTS, applying it in multi-agent systems poses unique challenges. CB-MCTS is the first to address this. We analyze CB-MCTS in the simple-regret setting and show in simulations that it outperforms Dec-MCTS in deceptive scenarios and remains competitive on standard benchmarks, providing a robust solution for multi-agent planning.
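To make the contrast in the abstract concrete, the following is a minimal sketch of deterministic UCT selection next to stochastic Boltzmann selection with a decaying entropy bonus. It is not the authors' implementation: the abstract does not give the exact update rules, so the function names, the temperature parameter, the decay schedule `beta0 / log(n + e)`, and the way the bonus is added to the value estimate are all assumptions chosen for illustration.

```python
import math
import random


def uct_select(q_values, visit_counts, c=1.4):
    """Deterministic UCT: pick the child with the highest upper confidence bound."""
    total = sum(visit_counts)

    def score(i):
        if visit_counts[i] == 0:
            return float("inf")  # unvisited children are tried first
        return q_values[i] + c * math.sqrt(math.log(total) / visit_counts[i])

    return max(range(len(q_values)), key=score)


def boltzmann_probs(q_values, temperature):
    """Softmax over value estimates (shifted by the max for numerical stability)."""
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    z = sum(weights)
    return [w / z for w in weights]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_bonus_weight(n_visits, beta0=1.0):
    """Hypothetical decay schedule: the bonus weight shrinks as visits grow."""
    return beta0 / math.log(n_visits + math.e)


def augmented_value(q, child_entropy, n_visits, beta0=1.0):
    """Assumed form: value estimate plus a decaying entropy bonus."""
    return q + entropy_bonus_weight(n_visits, beta0) * child_entropy


def boltzmann_select(q_values, temperature, rng=random):
    """Stochastic selection: sample a child in proportion to its softmax weight."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

Under these assumptions, UCT always returns the same child for a given set of statistics, while the Boltzmann rule keeps assigning nonzero probability to every child, and the entropy bonus fades as a node accumulates visits so that exploration narrows over time.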

Robotics & Embodied AI, Tool Use & Agents, World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
