This paper analyzes the generalization behavior of both discrete- and continuous-time ResNets from a dynamical systems perspective, using flow maps and Rademacher complexity. The authors derive generalization error bounds of order $O(1/\sqrt{S})$ in the number of training samples $S$ for both types of ResNets. The bounds include a structure-dependent negative term, which yields depth-uniform and asymptotic generalization guarantees under weaker assumptions than previous work.
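For orientation, the $O(1/\sqrt{S})$ rate has the same shape as the textbook Rademacher complexity bound, sketched below. This is the generic template, not the paper's result: the symbols $\mathcal{G}$, $\mathfrak{R}_S$, and $\delta$ are assumptions introduced here for illustration, and the paper's structure-dependent negative term and constants are not reproduced. Assume the loss class $\mathcal{G}$ takes values in $[0,1]$ and the sample $z_1, \dots, z_S$ is i.i.d. Then, with probability at least $1-\delta$, for every $g \in \mathcal{G}$:

$$
\mathbb{E}[g(z)] \;\le\; \frac{1}{S}\sum_{i=1}^{S} g(z_i) \;+\; 2\,\mathfrak{R}_S(\mathcal{G}) \;+\; \sqrt{\frac{\log(1/\delta)}{2S}} .
$$

Whenever the Rademacher complexity satisfies $\mathfrak{R}_S(\mathcal{G}) \le C/\sqrt{S}$ for some constant $C$ that does not depend on $S$, the whole generalization gap is $O(1/\sqrt{S})$. A depth-uniform bound is one in which such a constant also does not grow with the number of layers.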
ResNets generalize better than you think: a new depth-uniform generalization bound reveals a structure-dependent term that improves asymptotic performance.
Deep neural networks (DNNs) have significantly advanced machine learning, with model depth playing a central role in their successes. The dynamical system modeling approach has recently emerged as a powerful framework, offering new mathematical insights into the structure and learning behavior of DNNs. In this work, we establish generalization error bounds for both discrete- and continuous-time residual networks (ResNets) by combining Rademacher complexity, flow maps of dynamical systems, and the convergence behavior of ResNets in the deep-layer limit. The resulting bounds are of order $O(1/\sqrt{S})$ with respect to the number of training samples $S$, and include a structure-dependent negative term, yielding depth-uniform and asymptotic generalization bounds under milder assumptions. These findings provide a unified understanding of generalization across both discrete- and continuous-time ResNets, helping to close the gap in both the order of sample complexity and assumptions between the discrete- and continuous-time settings.
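The link between discrete- and continuous-time ResNets can be illustrated with a minimal sketch. It assumes the common formulation in which a depth-$L$ ResNet applies the update $x_{l+1} = x_l + \frac{1}{L} f(x_l, l/L)$, that is, forward Euler on the time interval $[0, 1]$. The $1/L$ scaling, the vector field `residual_field`, and the helper `resnet_flow` are hypothetical choices made for this illustration; they are not taken from the paper.

```python
import numpy as np

def residual_field(x, t):
    # Hypothetical smooth residual block f(x, t) = tanh(W(t) x).
    # The time-dependent weight W(t) is an illustrative assumption.
    W = np.array([[0.0, -1.0], [1.0, 0.0]]) * (1.0 + t)
    return np.tanh(W @ x)

def resnet_flow(x0, depth):
    # Discrete-time ResNet viewed as a flow map:
    # x_{l+1} = x_l + (1/L) f(x_l, l/L), i.e. forward Euler on [0, 1].
    x = np.asarray(x0, dtype=float)
    h = 1.0 / depth
    for layer in range(depth):
        x = x + h * residual_field(x, layer * h)
    return x

x0 = np.array([1.0, 0.5])
outputs = {L: resnet_flow(x0, L) for L in (4, 8, 64, 128)}

# Doubling the depth changes the output less and less as L grows,
# consistent with convergence to a continuous-time limit.
gap_shallow = np.linalg.norm(outputs[8] - outputs[4])
gap_deep = np.linalg.norm(outputs[128] - outputs[64])
print(f"||x(L=8) - x(L=4)||    = {gap_shallow:.5f}")
print(f"||x(L=128) - x(L=64)|| = {gap_deep:.5f}")
```

Under these assumptions, the shrinking gap between successive depths is the deep-layer convergence behavior that the abstract refers to: the discrete network's flow map approaches that of the ODE $\dot{x}(t) = f(x(t), t)$, which is what makes a bound that holds uniformly in depth plausible.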