This paper introduces a methodology for constructing confidence regions from SGD trajectories that remain asymptotically valid even when stochastic gradients have infinite variance. The approach rests on a joint weak convergence result for the Polyak-Ruppert averaged estimator and an empirical second-moment normalizer built from the stochastic gradients along the trajectory; together they yield a self-normalized statistic in which the leading tail-dependent scaling terms cancel. Critical values are calibrated by subsampling, so the method avoids explicit estimation of tail indices, slowly varying functions, or stable-law parameters, and the resulting confidence regions are straightforward to implement.
Forget estimating tail indices: this method delivers asymptotically valid confidence regions for SGD, even with infinite-variance gradients.
Stochastic gradient descent (SGD) is a foundational algorithm for large-scale statistical learning and stochastic optimization. However, statistical inference based on SGD iterates remains challenging when stochastic gradients have infinite variance, as the relevant limiting distributions depend on unknown nuisance parameters. In this paper, we develop an efficient, model-agnostic methodology for constructing confidence regions from SGD trajectories that applies in both finite- and infinite-variance regimes. The procedure is based on a joint weak convergence result for the Polyak-Ruppert averaged estimator and an empirical second-moment normalizer constructed from stochastic gradients along the SGD trajectory. This joint limit yields a self-normalized statistic in which the leading tail-dependent scaling terms cancel. We then use a subsampling calibration scheme to estimate the relevant critical values, avoiding explicit estimation of tail indices, slowly varying functions, or stable-law parameters. The resulting confidence regions are straightforward to implement and are asymptotically valid under both the finite- and infinite-second-moment regimes. Simulation studies show reliable coverage in various settings, supporting the proposed method as a practical tool for uncertainty quantification in stochastic optimization.
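To make the pipeline in the abstract concrete, the following is a minimal Python sketch of a self-normalized, subsampling-calibrated interval on a toy problem. It is an illustration under stated assumptions, not the paper's exact procedure: the one-dimensional mean-estimation objective (whose Hessian is 1, so no curvature correction is needed), the Student-t noise with 1.5 degrees of freedom, the step-size schedule, the normalizer `sqrt(sum of squared gradients)`, the non-overlapping blocks, and the function names `sgd_trajectory` and `self_normalized_ci` are all choices made here for illustration.

```python
import numpy as np


def sgd_trajectory(n, x_star=1.0, df=1.5, seed=0):
    """Run SGD on f(x) = E[(x - Z)^2] / 2 with heavy-tailed Z (toy assumption).

    Student-t noise with df = 1.5 has a finite mean but infinite variance.
    Returns the iterates and the stochastic gradients seen along the way.
    """
    rng = np.random.default_rng(seed)
    x = 0.0
    iterates = np.empty(n)
    grads = np.empty(n)
    for k in range(n):
        z = x_star + rng.standard_t(df)
        g = x - z                      # stochastic gradient at the current iterate
        x -= (k + 1) ** -0.75 * g      # assumed polynomially decaying step size
        iterates[k] = x
        grads[k] = g
    return iterates, grads


def self_normalized_ci(iterates, grads, block_len, level=0.95):
    """Subsampling-calibrated interval around the Polyak-Ruppert average.

    Statistic (assumed form): T = m * |average - center| / sqrt(sum of g^2),
    computed on blocks of length m and centered at the full-sample average.
    """
    n = len(iterates)
    xbar = iterates.mean()                      # Polyak-Ruppert averaged estimator
    v_n = np.sqrt(np.sum(grads ** 2))           # empirical second-moment normalizer
    block_stats = []
    for start in range(0, n - block_len + 1, block_len):
        block = slice(start, start + block_len)
        v_b = np.sqrt(np.sum(grads[block] ** 2))
        block_stats.append(block_len * abs(iterates[block].mean() - xbar) / v_b)
    q = np.quantile(block_stats, level)         # subsampling critical value
    half_width = q * v_n / n
    return xbar - half_width, xbar + half_width


iterates, grads = sgd_trajectory(5000)
lo, hi = self_normalized_ci(iterates, grads, block_len=100)
print(f"95% interval for x*: [{lo:.3f}, {hi:.3f}]")
```

Note that no tail index or stable-law parameter appears anywhere in the sketch: the unknown scaling is absorbed by the normalizer, and the critical value comes from the empirical distribution of the block statistics. In a multivariate or non-quadratic problem the statistic would need to account for the Hessian, which this toy example sidesteps.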