Feb 16, 2026 | arXiv:2602.14656

An Embarrassingly Simple Way to Optimize Orthogonal Matrices at Scale

AI Summary

The paper introduces POGO, a computationally efficient algorithm for optimizing orthogonal matrices at scale, addressing the limitations of existing methods that are either slow or temporarily relax orthogonality constraints. POGO achieves this by revisiting and improving upon the Landing algorithm, enabling the use of adaptive optimizers while strictly enforcing orthogonality. Experimental results demonstrate that POGO significantly outperforms existing optimizers on challenging benchmarks, optimizing problems with thousands of orthogonal matrices in minutes while maintaining orthogonality.

Key Contribution

POGO cuts the runtime of orthogonal optimization from hours to minutes on problems with thousands of orthogonal matrices, while in practice maintaining orthogonality at all times.

Abstract

Orthogonality constraints are ubiquitous in robust and probabilistic machine learning. Unfortunately, current optimizers are computationally expensive and do not scale to problems with hundreds or thousands of constraints. One notable exception is the Landing algorithm (Ablin et al., 2024), which, however, comes at the expense of temporarily relaxing orthogonality. In this work, we revisit and improve on the ideas behind Landing, enabling the inclusion of modern adaptive optimizers while ensuring that orthogonality constraints are effectively met. Remarkably, these improvements come at little to no cost, and reduce the number of required hyperparameters. Our algorithm POGO is fast and GPU-friendly, consisting of only 5 matrix products, and in practice maintains orthogonality at all times. On several challenging benchmarks, POGO greatly outperforms recent optimizers and shows it can optimize problems with thousands of orthogonal matrices in minutes while alternatives would take hours. As such, POGO sets a milestone to finally exploit orthogonality constraints in ML at scale. A PyTorch implementation of POGO is publicly available at https://github.com/adrianjav/pogo.
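The abstract does not spell out POGO's update rule, so the sketch below does not attempt to reproduce it. It only illustrates the trade-off the abstract attributes to Landing: a retraction-free step that combines a tangent direction, skew(G Xᵀ) X, with a pull toward the constraint, λ (X Xᵀ − I) X, in the form commonly given in the Landing literature. The helper names (`landing_step`, `orth_error`, `run_demo`), the toy objective 0.5 ‖X − A‖², and the values of `eta` and `lam` are all assumptions made for illustration. Plain Python lists keep the example dependency-free; a practical implementation would use batched tensors.

```python
import math

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def axpy(alpha, a, b):
    """Return alpha * a + b, elementwise."""
    return [[alpha * x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def fro_norm(a):
    return math.sqrt(sum(x * x for row in a for x in row))

def orth_error(x):
    """Frobenius distance ||X X^T - I||_F from the orthogonality constraint."""
    return fro_norm(axpy(-1.0, identity(len(x)), matmul(x, transpose(x))))

def landing_step(x, grad, eta=0.1, lam=1.0):
    """One Landing-style step (illustrative; eta and lam are assumed values)."""
    xt = transpose(x)
    gxt = matmul(grad, xt)
    # skew(G X^T) = 0.5 * (G X^T - X G^T)
    skew = [[0.5 * v for v in row] for row in axpy(-1.0, transpose(gxt), gxt)]
    tangent = matmul(skew, x)                      # moves along the manifold
    residual = axpy(-1.0, identity(len(x)), matmul(x, xt))
    normal = matmul(residual, x)                   # pulls back toward X X^T = I
    field = axpy(lam, normal, tangent)
    return axpy(-eta, field, x)

def run_demo(steps=300, angle=1.0):
    """Minimize 0.5 * ||X - A||_F^2 for a 2x2 rotation A, starting from I."""
    c, s = math.cos(angle), math.sin(angle)
    target = [[c, -s], [s, c]]
    x = identity(2)
    worst = 0.0
    for _ in range(steps):
        grad = axpy(-1.0, target, x)               # gradient of the toy objective
        x = landing_step(x, grad)
        worst = max(worst, orth_error(x))          # track constraint violation
    loss = 0.5 * fro_norm(axpy(-1.0, target, x)) ** 2
    return x, worst, loss
```

Calling `run_demo()` returns the final iterate, the largest orthogonality error seen along the way, and the final loss. In this toy setting the iterates drift slightly off the constraint during optimization and return to it only near convergence, which is the temporary relaxation the abstract says POGO is designed to avoid.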

Architecture Design (Transformers, SSMs, MoE) | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
