UCSD | Mar 3, 2026 | arXiv:2603.02639

Convex and Non-convex Federated Learning with Stale Stochastic Gradients: Diminishing Step Size is All You Need

Xinran Zheng, Tara Javidi, Behrouz Touri

AI Summary

This paper analyzes federated learning with delayed stochastic gradients, demonstrating that a pre-chosen diminishing step size achieves optimal SGD convergence rates for both nonconvex and strongly convex objectives, matching the performance of delay-adaptive step size methods. The analysis considers potentially biased and delayed stochastic gradient estimates transmitted from local agents to a central server. The key result is a theoretical proof that diminishing step sizes are sufficient to recover optimal SGD rates, simplifying implementation compared to adaptive schemes.

Key Contribution

Forget fancy adaptive schemes: simple diminishing step sizes are provably sufficient for optimal performance in federated learning with delayed gradients.

Abstract

We propose a general framework for distributed stochastic optimization under delayed gradient models. In this setting, $n$ local agents leverage their own data and computation to assist a central server in minimizing a global objective composed of the agents' local cost functions. Each agent is allowed to transmit stochastic (potentially biased and delayed) estimates of its local gradient. While prior work has advocated delay-adaptive step sizes for stochastic gradient descent (SGD) in the presence of delays, we demonstrate that a pre-chosen diminishing step size is sufficient and matches the performance of the adaptive scheme. Moreover, our analysis establishes that diminishing step sizes recover the optimal SGD rates for nonconvex and strongly convex objectives.
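To make the setting concrete, here is a minimal toy sketch of a server loop that applies stale, noisy gradients with a pre-chosen diminishing step size. It is not the paper's algorithm or experimental setup. The quadratic local costs, the uniformly random bounded delay, the Gaussian noise, the schedule 1/(t + max_delay), and the name `delayed_sgd` are all assumptions made for illustration.

```python
import random


def delayed_sgd(centers, steps=5000, max_delay=5, noise=0.1, seed=0):
    """Toy server loop minimizing (1/n) * sum_i 0.5 * (x - c_i)^2 with stale gradients.

    Hypothetical illustration only: the costs, delay model, noise model, and
    step-size schedule are assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    x = 0.0
    history = [x]  # past iterates, kept so stale gradients can be evaluated
    for t in range(1, steps + 1):
        # Pre-chosen diminishing step size: depends only on t, not on observed delays.
        eta = 1.0 / (t + max_delay)
        grad = 0.0
        for c in centers:
            # Each agent reports a gradient computed at an iterate up to max_delay steps old.
            delay = rng.randint(0, min(max_delay, len(history) - 1))
            stale_x = history[-1 - delay]
            # Gradient of 0.5 * (x - c)^2 at the stale iterate, plus zero-mean noise.
            grad += (stale_x - c) + rng.gauss(0.0, noise)
        x -= eta * grad / len(centers)
        history.append(x)
    return x


if __name__ == "__main__":
    # The global minimizer of this toy objective is the mean of the centers.
    print(delayed_sgd([1.0, 2.0, 3.0]))
```

In this sketch the step size never looks at the delays the server actually observes, which is the contrast with a delay-adaptive scheme. The toy objective is strongly convex, so it only illustrates one of the two regimes the abstract covers.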

Distributed Systems & Hardware, Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
