Search papers, labs, and topics across Lattice.
This paper introduces a computationally efficient online mirror descent algorithm for linear contextual bandits that is robust to both adversarial corruption and heavy-tailed noise with finite $(1+ε)$-th moments. The algorithm achieves $\mathcal{O}(1)$ computational cost per round, significantly improving upon the $\mathcal{O}(t\log T)$ cost of existing methods. The established regret bound consists of terms depending on the noise's $(1+ε)$-moment and the total corruption, matching existing guarantees under finite-variance assumptions when $ε=1$ and achieving best-known rates for heavy-tailed noise when no corruption is present.
Forget slow bandits: this new algorithm slashes per-round computation to O(1) while staying robust against adversarial corruption and heavy-tailed noise.
We study linear contextual bandits under adversarial corruption and heavy-tailed noise with finite $(1+ε)$-th moments for some $ε\in (0,1]$. Existing work that addresses both adversarial corruption and heavy-tailed noise relies on a finite variance (i.e., finite second-moment) assumption and suffers from computational inefficiency. We propose a computationally efficient algorithm based on online mirror descent that achieves robustness to both adversarial corruption and heavy-tailed noise. While the existing algorithm incurs $\mathcal{O}(t\log T)$ computational cost, our algorithm reduces this to $\mathcal{O}(1)$ per round. We establish an additive regret bound consisting of a term depending on the $(1+ε)$-moment bound of the noise and a term depending on the total amount of corruption. In particular, when $ε= 1$, our result recovers existing guarantees under finite-variance assumptions. When no corruption is present, it matches the best-known rates for linear contextual bandits with heavy-tailed noise. Moreover, the algorithm requires no prior knowledge of the noise moment bound or the total amount of corruption and still guarantees sublinear regret.