Feb 23, 2026arXiv:2602.19733

Understanding the Curse of Unrolling

Sheheryar Mehmood, Florian Knoll, Peter Ochs

AI Summary

This paper analyzes the "curse of unrolling" in algorithm unrolling, where derivative iterates initially diverge from the true Jacobian despite asymptotic correctness. The authors provide a non-asymptotic analysis identifying algorithmic factors governing this divergence and demonstrate that truncating early iterations of the derivative computation mitigates the curse and reduces memory. They further show that warm-starting in bilevel optimization implicitly truncates the unrolling, providing a practical solution.

Key Contribution

Algorithm unrolling's "curse" – initial Jacobian divergence – can be tamed by simply truncating early derivative iterations or using warm-starting in bilevel optimization.

Abstract

Algorithm unrolling is ubiquitous in machine learning, particularly in hyperparameter optimization and meta-learning, where Jacobians of solution mappings are computed by differentiating through iterative algorithms. Although unrolling is known to yield asymptotically correct Jacobians under suitable conditions, recent work has shown that the derivative iterates may initially diverge from the true Jacobian, a phenomenon known as the curse of unrolling. In this work, we provide a non-asymptotic analysis that explains the origin of this behavior and identifies the algorithmic factors that govern it. We show that truncating early iterations of the derivative computation mitigates the curse while simultaneously reducing memory requirements. Finally, we demonstrate that warm-starting in bilevel optimization naturally induces an implicit form of truncation, providing a practical remedy. Our theoretical findings are supported by numerical experiments on representative examples.

Architecture Design (Transformers, SSMs, MoE)Training Efficiency & Optimization

Citation Metrics

Citations0

Influential citations0

References0

Year2026

VenueN/A

Related Papers

Finding related papers...

Search

Understanding the Curse of Unrolling

Related Papers