This paper analyzes the exploitability of Follow-the-Regularized-Leader (FTRL) learners with constant step size in two-player zero-sum games against a clairvoyant optimizer. It demonstrates that exploitability is an inherent property of the FTRL family, scaling with the number of suboptimal actions taken by the learner and persisting even against alternating optimizers. The analysis reveals a geometric dichotomy based on the steepness of the regularizer, which governs how quickly the optimizer can exploit the learner.
FTRL learners are inherently exploitable in two-player zero-sum games, regardless of equilibrium structure, revealing a fundamental weakness in this widely used family of online learning algorithms.
In this paper we investigate the exploitability of a Follow-the-Regularized-Leader (FTRL) learner with constant step size $\eta$ in $n\times m$ two-player zero-sum games played over $T$ rounds against a clairvoyant optimizer. In contrast with prior analyses, we show that exploitability is an inherent feature of the FTRL family rather than an artifact of specific instantiations. First, for a fixed optimizer, we establish a sweeping law of order $\Omega(N/\eta)$, proving that exploitation scales with the number $N$ of the learner's suboptimal actions and vanishes in their absence. Second, for an alternating optimizer, a surplus of $\Omega(\eta T/\mathrm{poly}(n,m))$ can be guaranteed in random games, with high probability, regardless of the equilibrium structure. Our analysis once again uncovers a sharp geometric dichotomy: non-steep regularizers allow the optimizer to extract maximum surplus via finite-time elimination of suboptimal actions, whereas steep ones introduce a vanishing correction that may delay exploitation. Finally, we discuss whether this leverage persists under bilateral payoff uncertainty, and we propose a susceptibility measure to quantify which regularizers are most vulnerable to strategic manipulation.
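To make the setting concrete, here is a minimal sketch (not the paper's construction) of an FTRL learner with the entropic regularizer, whose closed-form update is multiplicative weights, playing the row of a zero-sum matrix game against a clairvoyant column player that best-responds to the learner's current mixed strategy each round. The payoff matrix, step size, and horizon below are illustrative choices.

```python
import numpy as np

def ftrl_vs_clairvoyant(A, eta=0.1, T=2000):
    """Row player runs FTRL with entropic regularizer (= multiplicative
    weights) and constant step size eta; the column player is clairvoyant,
    best-responding to the learner's mixed strategy x_t every round.
    Returns the learner's average per-round payoff. Hypothetical sketch
    of the abstract's setting, not the paper's analysis."""
    n, m = A.shape
    cum = np.zeros(n)              # cumulative payoff vector seen by the learner
    total = 0.0
    for _ in range(T):
        # entropic-FTRL closed form: x_t proportional to exp(eta * cum)
        z = eta * cum
        x = np.exp(z - z.max())    # shift for numerical stability
        x /= x.sum()
        # clairvoyant optimizer picks the column minimizing the row payoff
        j = int(np.argmin(x @ A))
        total += float(x @ A[:, j])
        cum += A[:, j]             # learner observes the played column
    return total / T

# matching pennies: the game value is 0, yet the best-responding
# optimizer drives the learner's average payoff below it
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
avg = ftrl_vs_clairvoyant(A, eta=0.1, T=2000)
print(avg)
```

Against such an optimizer the learner's average payoff sits strictly below the game value, which is the kind of extracted surplus the abstract quantifies.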