Mar 2, 2026 | arXiv:2603.01951

Accelerating Single-Pass SGD for Generalized Linear Prediction

AI Summary

This paper introduces a single-pass stochastic gradient descent (SGD) algorithm with momentum for generalized linear prediction in the streaming setting, where each iteration uses one fresh data point. The algorithm uses a data-dependent proximal method to achieve dual-momentum acceleration, answering the open question posed by Jain et al. [2018a] of whether momentum can accelerate single-pass non-quadratic stochastic optimization. The derived excess risk bound decomposes into an improved optimization error, a minimax optimal statistical error, and a higher-order model-misspecification error. The authors conclude that momentum acceleration is more effective than variance reduction in this setting.

Key Contribution

Momentum *can* accelerate single-pass stochastic gradient descent for generalized linear prediction in the streaming setting, resolving the open problem posed by Jain et al. [2018a] and proving more effective than variance reduction.

Abstract

We study generalized linear prediction under a streaming setting, where each iteration uses only one fresh data point for a gradient-level update. While momentum is well-established in deterministic optimization, a fundamental open question is whether it can accelerate such single-pass non-quadratic stochastic optimization. We propose the first algorithm that successfully incorporates momentum via a novel data-dependent proximal method, achieving dual-momentum acceleration. Our derived excess risk bound decomposes into three components: an improved optimization error, a minimax optimal statistical error, and a higher-order model-misspecification error. The proof handles mis-specification via a fine-grained stationary analysis of inner updates, while localizing statistical error through a two-phase outer-loop analysis. As a result, we resolve the open problem posed by Jain et al. [2018a] and demonstrate that momentum acceleration is more effective than variance reduction for generalized linear prediction in the streaming setting.
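For orientation, the streaming setting described in the abstract can be illustrated with the plain single-pass SGD baseline that momentum methods aim to improve on. The sketch below is not the paper's algorithm: it contains no momentum and no data-dependent proximal step. The logistic link, the synthetic Gaussian data stream, the constant step size, and the iterate averaging are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic link (an assumed choice of GLM link).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sample_stream(w_true, n, rng):
    # Hypothetical data stream: Gaussian features, Bernoulli labels
    # drawn from a well-specified logistic model with parameter w_true.
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in w_true]
        p = sigmoid(sum(wi * xi for wi, xi in zip(w_true, x)))
        y = 1.0 if rng.random() < p else 0.0
        yield x, y


def single_pass_sgd(stream, dim, step_size):
    # Baseline single-pass SGD: each sample is used for exactly one
    # gradient step and then discarded. Returns the averaged iterate.
    w = [0.0] * dim
    w_avg = [0.0] * dim
    for t, (x, y) in enumerate(stream, start=1):
        # Gradient of the logistic loss at (x, y) is (sigmoid(w.x) - y) * x.
        residual = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
        w = [wi - step_size * residual * xi for wi, xi in zip(w, x)]
        # Running mean of the iterates w_1, ..., w_t.
        w_avg = [a + (wi - a) / t for a, wi in zip(w_avg, w)]
    return w_avg


rng = random.Random(0)
w_true = [1.0, -2.0, 0.5]
w_hat = single_pass_sgd(sample_stream(w_true, 20000, rng), dim=3, step_size=0.05)
```

Here `w_hat` is an estimate of `w_true` built from one pass over 20,000 samples. The paper's contribution concerns how a momentum-based scheme can reduce the optimization error of this kind of update while keeping the statistical error minimax optimal.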

Natural Language Processing | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
