PurdueFeb 18, 2026arXiv:2602.16183

Multi-Agent Combinatorial-Multi-Armed-Bandit framework for the Submodular Welfare Problem under Bandit Feedback

Subham Pokhriyal, Shweta Jain, Vaneet Aggarwal

AI Summary

This paper addresses the Submodular Welfare Problem (SWP) under bandit feedback, extending it to a multi-agent combinatorial bandit (MA-CMAB) framework where agents are coupled through shared allocation constraints. They tackle the challenge of maximizing total welfare when agents have monotone submodular utilities and only receive bandit feedback on their allocations. The authors propose an explore-then-commit strategy with randomized assignments, achieving a regret of $\tilde{\mathcal{O}}(T^{2/3})$ compared to a $(1-1/e)$ benchmark, representing the first such guarantee for partition-based SWP under bandit feedback.

Key Contribution

First regret bound for partition-based submodular welfare under bandit feedback is achieved by a novel multi-agent combinatorial bandit framework.

Abstract

We study the \emph{Submodular Welfare Problem} (SWP), where items are partitioned among agents with monotone submodular utilities to maximize the total welfare under \emph{bandit feedback}. Classical SWP assumes full value-oracle access, achieving $(1-1/e)$ approximations via continuous-greedy algorithms. We extend this to a \emph{multi-agent combinatorial bandit} framework (\textsc{MA-CMAB}), where actions are partitions under full-bandit feedback with non-communicating agents. Unlike prior single-agent or separable multi-agent CMAB models, our setting couples agents through shared allocation constraints. We propose an explore-then-commit strategy with randomized assignments, achieving $\tilde{\mathcal{O}}(T^{2/3})$ regret against a $(1-1/e)$ benchmark, the first such guarantee for partition-based submodular welfare problem under bandit feedback.

Recommendation & Information Retrieval Tool Use & Agents

Citation Metrics

Citations0

Influential citations0

References0

Year2026

VenueN/A

Related Papers

Finding related papers...

Search

Multi-Agent Combinatorial-Multi-Armed-Bandit framework for the Submodular Welfare Problem under Bandit Feedback

Related Papers