This paper studies the distributed experts problem, aiming to minimize regret while keeping communication across multiple servers low. The authors analyze the setting where each expert's loss at each timestep is the $\ell_p$ norm of its losses across the $s$ servers. They present a protocol achieving regret $R$, for any $R\gtrsim\frac{1}{\sqrt{T}\cdot\text{poly}\log(nsT)}$, using $\mathcal{O}\left(\frac{n}{R^2}+\frac{s}{R^2}\right)\cdot\max(s^{1-2/p},1)\cdot\text{poly}\log(nsT)$ bits of communication, surpassing existing approaches.
This new protocol slashes communication costs in distributed expert learning while *improving* regret bounds.
In this paper, we study the distributed experts problem, where $n$ experts are distributed across $s$ servers for $T$ timesteps. The loss of each expert at each time $t$ is the $\ell_p$ norm of the vector consisting of the expert's losses at each of the $s$ servers at time $t$. The goal is to minimize the regret $R$, i.e., the loss of the distributed protocol compared to the loss of the best expert, amortized over all $T$ timesteps, while using the minimum amount of communication. We give a protocol that achieves regret $R$, for any $R\gtrsim\frac{1}{\sqrt{T}\cdot\text{poly}\log(nsT)}$, using $\mathcal{O}\left(\frac{n}{R^2}+\frac{s}{R^2}\right)\cdot\max(s^{1-2/p},1)\cdot\text{poly}\log(nsT)$ bits of communication, which improves on previous work.
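To make the loss model and regret definition concrete, here is a minimal, centralized sketch, not the paper's communication-efficient protocol: a hypothetical `hedge_regret` function that aggregates each expert's per-server losses with the $\ell_p$ norm and runs standard multiplicative weights (Hedge) on the aggregated losses, ignoring the communication constraints the paper's protocol is designed to respect. All names and parameters are illustrative assumptions.

```python
import numpy as np

def hedge_regret(server_losses, p=2.0, eta=None):
    """Centralized (communication-free) baseline for the setup above.

    server_losses: array of shape (T, s, n) holding the loss of each of
    the n experts at each of the s servers over T timesteps, in [0, 1].
    Aggregates per-server losses with the l_p norm, runs multiplicative
    weights (Hedge), and returns the amortized regret R.
    """
    T, s, n = server_losses.shape
    # Loss of expert i at time t = l_p norm of its s per-server losses.
    losses = np.linalg.norm(server_losses, ord=p, axis=1)  # shape (T, n)
    scale = max(float(losses.max()), 1.0)
    losses = losses / scale  # the Hedge analysis assumes losses in [0, 1]
    if eta is None:
        eta = np.sqrt(np.log(n) / T)  # standard Hedge learning rate

    weights = np.ones(n)
    total = 0.0
    for loss_t in losses:
        probs = weights / weights.sum()
        total += probs @ loss_t            # expected loss of the player
        weights *= np.exp(-eta * loss_t)   # multiplicative-weights update

    best = losses.sum(axis=0).min()        # best single expert in hindsight
    return (total - best) * scale / T      # amortized regret R

# Toy run: 4 servers, 16 experts, 2000 timesteps of random losses.
R = hedge_regret(np.random.default_rng(1).random((2000, 4, 16)), p=2.0)
print(f"amortized regret R ~ {R:.4f}")
```

In the distributed setting, computing `losses` exactly would require every server to ship its loss vector each timestep; the paper's contribution is a protocol that approximates this with far fewer bits while keeping the amortized regret near $1/\sqrt{T}$.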