This paper addresses the open problem of connecting the law of robustness (overparameterization for robust interpolation) with robust generalization (small robust training loss implies small robust test loss). The authors introduce a novel notion of robust generalization error and derive a lower bound on the expected Rademacher complexity of the induced robust loss class. Their analysis recovers the $\Omega(n^{1/d})$ regime of Wu et al. (2023) and demonstrates that robust generalization does not fundamentally alter the required Lipschitz constant for smooth interpolation.
Robust generalization isn't as hard as you think: up to constants, it leaves the order of the Lipschitz constant needed for smooth interpolation unchanged.
Bubeck and Sellke (2021) pose as an open problem the connection between the law of robustness and robust generalization. The law of robustness states that overparameterization is necessary for models to interpolate robustly; in particular, robust interpolation requires the learned function to be Lipschitz. Robust generalization asks whether small robust training loss implies small robust test loss. We resolve this problem by explicitly connecting the two for arbitrary data distributions. Specifically, we introduce a nontrivial notion of robust generalization error and convert it into a lower bound on the expected Rademacher complexity of the induced robust loss class. Our bounds recover the $\Omega(n^{1/d})$ regime of Wu et al. (2023) and show that, up to constants, robust generalization does not change the order of the Lipschitz constant required for smooth interpolation. We conduct experiments to probe the predicted scaling with dataset size and model capacity, testing whether empirical behavior aligns more closely with the predictions of Bubeck and Sellke (2021) or Wu et al. (2023). For MNIST, we find that the lower-bound Lipschitz constant scales on the order predicted by Wu et al. (2023). Informally, to obtain low robust generalization error, the Lipschitz constant must lie in a range that we bound, and the allowable perturbation radius is linked to the Lipschitz scale.
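To make the idea of a lower bound on a Lipschitz constant concrete, here is a minimal sketch. It is not the authors' experimental procedure: it only uses the standard fact that any function interpolating the pairs $(x_i, y_i)$ has a Euclidean Lipschitz constant of at least $\max_{i \neq j} |y_i - y_j| / \lVert x_i - x_j \rVert_2$. The function name `empirical_lipschitz_lower_bound` and the toy data are hypothetical, chosen for illustration.

```python
import math
from itertools import combinations


def empirical_lipschitz_lower_bound(points, values):
    """Largest pairwise slope |y_i - y_j| / ||x_i - x_j||_2 over the sample.

    Any function that interpolates the data must have a Lipschitz constant
    (with respect to the Euclidean norm) at least this large. This is a
    generic data-dependent bound, assumed here for illustration only.
    """
    best = 0.0
    for (x_i, y_i), (x_j, y_j) in combinations(zip(points, values), 2):
        dist = math.dist(x_i, x_j)
        # Coincident inputs give no finite slope, so they are skipped.
        if dist > 0:
            best = max(best, abs(y_i - y_j) / dist)
    return best


# Toy data (hypothetical): three collinear points in the plane.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
values = [0.0, 10.0, 5.0]
print(empirical_lipschitz_lower_bound(points, values))
```

Evaluating this quantity on subsamples of increasing size $n$ would be one simple way to look at how such a data-dependent bound grows with dataset size, although the scaling results quoted above come from the paper's own analysis and experiments rather than from this sketch.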