Mar 16, 2026arXiv:2603.15189

The Sampling Complexity of Condorcet Winner Identification in Dueling Bandits

El Mehdi Saad, Victor Thuot, Nicolas Verzelen

AI Summary

This paper introduces a new best-arm identification procedure for stochastic dueling bandits, assuming only the existence of a Condorcet winner. The procedure leverages the full gap matrix, $Δ_{i,j}$, to improve sample complexity guarantees compared to methods that only consider gaps between the Condorcet winner and other arms. The authors derive high-probability, instance-dependent sample complexity guarantees and establish new lower bounds, demonstrating the optimality of their non-asymptotic bounds and revealing new regimes in sample complexity.

Key Contribution

Finding the best arm in a dueling bandit setting gets a serious upgrade: a new algorithm exploits all pairwise comparisons, slashing sample complexity and revealing limitations in existing asymptotic analyses.

Abstract

We study best-arm identification in stochastic dueling bandits under the sole assumption that a Condorcet winner exists, i.e., an arm that wins each noisy pairwise comparison with probability at least $1/2$. We introduce a new identification procedure that exploits the full gap matrix $Δ_{i,j}=q_{i,j}-\tfrac12$ (where $q_{i,j}$ is the probability that arm $i$ beats arm $j$), rather than only the gaps between the Condorcet winner and the other arms. We derive high-probability, instance-dependent sample-complexity guarantees that (up to logarithmic factors) improve the best known ones by leveraging informative comparisons beyond those involving the winner. We complement these results with new lower bounds which, to our knowledge, are the first for Condorcet-winner identification in stochastic dueling bandits. Our lower-bound analysis isolates the intrinsic cost of locating informative entries in the gap matrix and estimating them to the required confidence, establishing the optimality of our non-asymptotic bounds. Overall, our results reveal new regimes and trade-offs in the sample complexity that are not captured by asymptotic analyses based only on the expected budget.

Citation Metrics

Citations0

Influential citations0

References0

Year2026

VenueN/A

Related Papers

Finding related papers...

Search

The Sampling Complexity of Condorcet Winner Identification in Dueling Bandits

Related Papers