Search papers, labs, and topics across Lattice.
This paper revisits Angluin's model of learning from equivalence queries, where a learner proposes hypotheses and receives counterexamples when inadequate. It addresses the pessimism of fully adversarial counterexample generation and the unnatural assumption of full-information feedback by restricting the environment to "symmetric" counterexample generators, which choose counterexamples based only on the symmetric difference between the hypothesis and the target. The authors derive tight bounds on the number of learning rounds under both full-information and bandit feedback within this framework, using game-theoretic arguments and adaptive weighting methods.
Forget adversarial attacks – learning can be efficient even with counterexamples, as long as they're chosen symmetrically based on the difference between your model and the truth.
Modern machine learning systems, such as generative models and recommendation systems, often evolve through a cycle of deployment, user interaction, and periodic model updates. This differs from standard supervised learning frameworks, which focus on loss or regret minimization over a fixed sequence of prediction tasks. Motivated by this setting, we revisit the classical model of learning from equivalence queries, introduced by Angluin (1988). In this model, a learner repeatedly proposes hypotheses and, when a deployed hypothesis is inadequate, receives a counterexample. Under fully adversarial counterexample generation, however, the model can be overly pessimistic. In addition, most prior work assumes a \emph{full-information} setting, where the learner also observes the correct label of the counterexample, an assumption that is not always natural. We address these issues by restricting the environment to a broad class of less adversarial counterexample generators, which we call \emph{symmetric}. Informally, such generators choose counterexamples based only on the symmetric difference between the hypothesis and the target. This class captures natural mechanisms such as random counterexamples (Angluin and Dohrn, 2017; Bhatia, 2021; Chase, Freitag, and Reyzin, 2024), as well as generators that return the simplest counterexample according to a prescribed complexity measure. Within this framework, we study learning from equivalence queries under both full-information and bandit feedback. We obtain tight bounds on the number of learning rounds in both settings and highlight directions for future work. Our analysis combines a game-theoretic view of symmetric adversaries with adaptive weighting methods and minimax arguments.