Feb 19, 2026 · arXiv:2602.17596

Asymptotic Smoothing of the Lipschitz Loss Landscape in Overparameterized One-Hidden-Layer ReLU Networks

AI Summary

This paper analyzes the loss landscape of overparameterized one-hidden-layer ReLU networks with $\ell_1$-regularization on the second layer. For convex $L$-Lipschitz losses, the authors show that any two models at the same loss level can be joined by a continuous path along which the loss rises by at most an arbitrarily small amount, so sublevel sets become connected as the network width grows. They also prove an asymptotic upper bound on the energy gap between local and global minima; the bound vanishes with increasing width, which means the loss landscape flattens. Experiments using Dynamic String Sampling (DSS) on synthetic and real datasets are consistent with the theory: wider networks show smaller energy gaps.
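To make the notion of an energy gap concrete, the following is a minimal sketch, not the authors' procedure. It assumes a logistic loss (convex and 1-Lipschitz in the network output), labels in {-1, +1}, and a straight-line path between two parameter settings; the names `forward`, `objective`, `linear_path_gap`, and the penalty weight `lam` are hypothetical. The straight segment only gives an upper bound on the barrier, since a better path (such as one found by DSS) may have a smaller rise.

```python
import numpy as np

def forward(W, a, X):
    """One-hidden-layer ReLU network: f(x) = a . relu(W x)."""
    return np.maximum(X @ W.T, 0.0) @ a

def objective(W, a, X, y, lam=1e-3):
    """Mean logistic loss plus an l1 penalty on the second layer a.

    The logistic loss is an assumed stand-in for a convex Lipschitz loss;
    labels y are assumed to lie in {-1, +1}.
    """
    margins = y * forward(W, a, X)
    return np.mean(np.logaddexp(0.0, -margins)) + lam * np.abs(a).sum()

def linear_path_gap(theta0, theta1, X, y, steps=51, lam=1e-3):
    """Largest objective value on the straight segment between two models,
    minus the larger of the two endpoint values (always >= 0)."""
    (W0, a0), (W1, a1) = theta0, theta1
    ts = np.linspace(0.0, 1.0, steps)
    vals = np.array([
        objective((1 - t) * W0 + t * W1, (1 - t) * a0 + t * a1, X, y, lam)
        for t in ts
    ])
    return vals.max() - max(vals[0], vals[-1])
```

Calling `linear_path_gap` on pairs of independently trained models at several widths would give one crude way to see whether the barrier shrinks with width, under the assumptions above.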

Key Contribution

For overparameterized one-hidden-layer ReLU networks with convex Lipschitz losses, the loss landscape flattens with width: the energy gap between local and global minima vanishes as the network width grows.

Abstract

We study the topology of the loss landscape of one-hidden-layer ReLU networks under overparameterization. On the theory side, we (i) prove that for convex $L$-Lipschitz losses with an $\ell_1$-regularized second layer, every pair of models at the same loss level can be connected by a continuous path within an arbitrarily small loss increase $\varepsilon$ (extending a known result for the quadratic loss); (ii) obtain an asymptotic upper bound on the energy gap $\varepsilon$ between local and global minima that vanishes as the width $m$ grows, implying that the landscape flattens and sublevel sets become connected in the limit. Empirically, on a synthetic Moons dataset and on the Wisconsin Breast Cancer dataset, we measure pairwise energy gaps via Dynamic String Sampling (DSS) and find that wider networks exhibit smaller gaps; in particular, a permutation test on the maximum gap yields $p_{\mathrm{perm}}=0$, indicating a clear reduction in the barrier height.
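A permutation test on the maximum gap could look roughly like the sketch below. This is an illustration under assumptions, not the authors' code: the statistic (maximum gap in the narrow group minus maximum gap in the wide group), the one-sided direction, and the function name `perm_test_max_gap` are all hypothetical. The sketch returns the raw fraction of permutations whose statistic is at least the observed one, which can be exactly 0 when no sampled permutation reaches it; whether the reported $p_{\mathrm{perm}}=0$ was computed this way is not stated in the abstract.

```python
import numpy as np

def perm_test_max_gap(gaps_narrow, gaps_wide, n_perm=10_000, seed=0):
    """One-sided permutation test on the difference of maximum gaps.

    Returns (observed statistic, fraction of permutations with a
    statistic >= observed). Group labels are shuffled by permuting
    the pooled gap values.
    """
    rng = np.random.default_rng(seed)
    narrow = np.asarray(gaps_narrow, dtype=float)
    wide = np.asarray(gaps_wide, dtype=float)
    observed = narrow.max() - wide.max()
    pooled = np.concatenate([narrow, wide])
    n = narrow.size
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        # Recompute the statistic under a random reassignment of labels.
        if perm[:n].max() - perm[n:].max() >= observed:
            hits += 1
    return observed, hits / n_perm
```

The inputs would be the pairwise energy gaps measured at two widths; a small returned fraction suggests the largest barrier in the narrow group is unlikely to arise from a random relabeling.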

Architecture Design (Transformers, SSMs, MoE) · Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
