This paper establishes a non-asymptotic connection, via the Gordon comparison theorem, between the training dynamics of a class of algorithms on Gaussian mixture data and a simpler surrogate dynamical system. The connection rigorously validates dynamic mean-field (DMF) expressions in the asymptotic regime and enables an iterative refinement scheme that improves accuracy in non-asymptotic settings. The theory is applied to the training of a perceptron model with a generic first-order algorithm, revealing fluctuation parameters that emerge in the non-asymptotic regime beyond the DMF kernels.
Gordon's comparison theorem bridges the gap between complex ML training dynamics and tractable surrogate systems, offering a path to more accurate non-asymptotic analysis.
We study training algorithms on data following a Gaussian mixture model. For a specific family of such algorithms, we present a non-asymptotic result connecting the evolution of the model to a surrogate dynamical system that can be easier to analyze. The proof of our result is based on the celebrated Gordon comparison theorem. Using our theorem, we rigorously prove the validity of the dynamic mean-field (DMF) expressions in the asymptotic regime. Moreover, we propose an iterative refinement scheme to obtain more accurate expressions in non-asymptotic settings. We specialize our theory to the analysis of training a perceptron model with a generic first-order (full-batch) algorithm and demonstrate that, in the non-asymptotic regime, fluctuation parameters emerge in addition to the DMF kernels.
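To make the analyzed setting concrete, here is a minimal sketch (not the paper's actual experiment) of full-batch first-order training of a perceptron on two-component Gaussian mixture data, the kind of trajectory whose low-dimensional summaries the DMF-style analysis tracks. The dimensions, step size, and choice of logistic loss are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch: full-batch gradient descent on a perceptron with Gaussian mixture data.
# All numerical choices (n, d, lr, steps) and the logistic loss are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 500  # number of samples and input dimension (assumed values)

# Two-component Gaussian mixture: labels y in {-1, +1}, class means +/- mu, identity covariance.
mu = rng.normal(size=d) / np.sqrt(d)
y = rng.choice([-1.0, 1.0], size=n)
X = y[:, None] * mu[None, :] + rng.normal(size=(n, d))

w = np.zeros(d)        # perceptron weights
lr, steps = 0.5, 200   # step size and number of full-batch iterations

def grad(w):
    """Full-batch gradient of the mean logistic loss over the dataset."""
    margins = y * (X @ w)
    return -(X.T @ (y / (1.0 + np.exp(margins)))) / n

# Generic first-order (full-batch) dynamics: w_{t+1} = w_t - lr * grad(w_t).
# An analysis of this kind tracks summaries of the trajectory, e.g. the
# overlap of w with the mixture mean and the weight norm over iterations.
for t in range(steps):
    w -= lr * grad(w)
    if t % 50 == 0:
        overlap = (w @ mu) / (np.linalg.norm(mu) + 1e-12)
        print(f"step {t:4d}  overlap with mean = {overlap:.3f}  ||w|| = {np.linalg.norm(w):.3f}")
```

In this sketch, the printed overlap and norm play the role of the order parameters that a surrogate (DMF-type) description would predict; the paper's results concern how accurately such predictions hold outside the asymptotic limit.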