The paper introduces SpecMuon, a novel optimizer designed to address optimization challenges in physics-informed neural networks (PINNs) and neural operators arising from ill-conditioned gradients and stiffness. SpecMuon extends the Muon optimizer by incorporating a mode-wise relaxed scalar auxiliary variable (RSAV) mechanism, enabling adaptive step size regulation based on the spectral decomposition of gradients. Theoretical analysis demonstrates SpecMuon's energy dissipation, auxiliary variable properties, and global convergence, while empirical results show improved convergence speed and stability compared to Adam, AdamW, and Muon on benchmark problems.
By tuning step sizes based on the spectral properties of gradients, SpecMuon offers a more stable and faster alternative to Adam and Muon for training physics-informed neural networks.
Physics-informed neural networks and neural operators often suffer from severe optimization difficulties caused by ill-conditioned gradients, multi-scale spectral behavior, and stiffness induced by physical constraints. Recently, the Muon optimizer has shown promise by performing orthogonalized updates in the singular-vector basis of the gradient, thereby improving geometric conditioning. However, its unit-singular-value updates may lead to overly aggressive steps and lack explicit stability guarantees when applied to physics-informed learning. In this work, we propose SpecMuon, a spectral-aware optimizer that integrates Muon's orthogonalized geometry with a mode-wise relaxed scalar auxiliary variable (RSAV) mechanism. By decomposing matrix-valued gradients into singular modes and applying RSAV updates individually along dominant spectral directions, SpecMuon adaptively regulates step sizes according to the global loss energy while preserving Muon's scale-balancing properties. This formulation interprets optimization as a multi-mode gradient flow and enables principled control of stiff spectral components. We establish rigorous theoretical properties of SpecMuon, including a modified energy dissipation law, positivity and boundedness of auxiliary variables, and global convergence with a linear rate under the Polyak-Łojasiewicz condition. Numerical experiments on physics-informed neural networks, DeepONets, and fractional PINN-DeepONets demonstrate that SpecMuon achieves faster convergence and improved stability compared with Adam, AdamW, and the original Muon optimizer on benchmark problems such as the one-dimensional Burgers equation and fractional partial differential equations.
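
To make the phrase "unit-singular-value updates" concrete, the following is a minimal NumPy sketch, not the authors' implementation. It uses an exact SVD to form the orthogonalized direction U Vᵀ from a matrix-valued gradient, so every singular mode gets the same step length regardless of the gradient's conditioning. The second function, `modewise_update`, and its `mode_scales` argument are hypothetical names introduced here only to show where a per-mode step-size factor would enter. The values passed to it are placeholders and are not computed by the RSAV rule, whose details the abstract does not give.

```python
import numpy as np


def orthogonalized_update(grad):
    """Return U @ Vt: the gradient with every singular value replaced by 1."""
    u, _, vt = np.linalg.svd(grad, full_matrices=False)
    return u @ vt


def modewise_update(grad, mode_scales):
    """Hypothetical helper: scale each singular mode by its own factor.

    mode_scales stands in for per-mode step-size factors; it is an
    assumption for illustration, not the paper's RSAV update.
    """
    u, _, vt = np.linalg.svd(grad, full_matrices=False)
    # Multiplying the columns of u by mode_scales gives U @ diag(scales) @ Vt.
    return (u * mode_scales) @ vt


# A deliberately ill-conditioned 4x3 "gradient" with very different column scales.
rng = np.random.default_rng(0)
grad = rng.standard_normal((4, 3)) * np.array([100.0, 1.0, 0.01])

step = orthogonalized_update(grad)
step_singular_values = np.linalg.svd(step, compute_uv=False)

# Placeholder factors, chosen by hand purely for demonstration.
placeholder_scales = np.array([0.5, 0.2, 0.1])
damped = modewise_update(grad, placeholder_scales)
damped_singular_values = np.linalg.svd(damped, compute_uv=False)
```

In this sketch the singular values of `step` are all equal to one up to floating-point error, which is the scale-balancing behavior the abstract attributes to Muon, while `damped` carries exactly the placeholder factors as its singular values. Replacing the hand-picked factors with values driven by the loss energy is the role the abstract assigns to the mode-wise RSAV mechanism.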