The paper introduces Non-linear Rank Adaptation (NoRA), a weight-level parallel adapter that uses SiLU gating and structural dropout to overcome the "linear ceiling" of LoRA in parameter-efficient fine-tuning. By expanding the adaptation manifold and activating the dormant tail of the singular value spectrum, NoRA achieves better spectral efficiency: at rank 64 it edges out LoRA at rank 512 on SlimOrca, and it reaches lower perplexity than LoRA on MathInstruct. SVD analysis confirms that NoRA prevents the rank collapse observed in linear low-rank adaptation methods.
Forget scaling LoRA rank: NoRA unlocks better performance at 1/8th the rank by injecting non-linearities that expand the adaptation manifold.
Low-Rank Adaptation (LoRA) dominates parameter-efficient fine-tuning (PEFT). However, it faces a critical "linear ceiling" in complex reasoning tasks: simply increasing the rank yields diminishing returns due to intrinsic linear constraints. We introduce NoRA (Non-linear Rank Adaptation), a weight-level parallel adapter that injects SiLU gating and structural dropout to induce manifold expansion. On the SlimOrca benchmark, NoRA breaks this linear barrier: NoRA at rank 64 (PPL 3.89) outperforms LoRA at rank 512 (PPL 3.90), demonstrating superior spectral efficiency. This advantage generalizes to mathematical reasoning, where NoRA achieves a perplexity of 1.97 on MathInstruct, improving on LoRA's saturation point of 2.07. Mechanism analysis via Singular Value Decomposition (SVD) confirms that NoRA activates the dormant tail of the singular value spectrum, effectively preventing the rank collapse observed in linear methods.
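The rank argument behind the "linear ceiling" can be illustrated with a small numerical sketch. It is not the paper's implementation: the exact NoRA formulation (where the SiLU gate sits, how structural dropout is applied) is not given in this abstract, so the element-wise SiLU on the update matrix below is an assumption made purely for illustration, and the names `silu`, `numerical_rank`, `d`, and `r` are hypothetical. The sketch only shows that a linear product of two rank-r factors has at most r non-zero singular values, while an element-wise non-linearity on the same factors can populate the tail of the spectrum.

```python
import numpy as np

def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def numerical_rank(matrix, rel_tol=1e-6):
    """Count singular values larger than rel_tol times the largest one."""
    s = np.linalg.svd(matrix, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(0)
d, r = 64, 4                      # hypothetical layer width and adapter rank
A = rng.standard_normal((r, d))   # down-projection factor
B = rng.standard_normal((d, r))   # up-projection factor

# LoRA-style linear update: its rank can never exceed r.
linear_update = B @ A

# Assumed non-linear variant (illustration only, not NoRA's exact form):
# an element-wise SiLU breaks the low-rank product structure.
nonlinear_update = silu(B @ A)

linear_rank = numerical_rank(linear_update)
nonlinear_rank = numerical_rank(nonlinear_update)
print(f"linear rank: {linear_rank}, non-linear rank: {nonlinear_rank}")
```

Both updates use the same 2·d·r trainable parameters; only the non-linear one spreads energy beyond the first r singular values, which is the kind of spectral-tail activation the SVD analysis in the abstract refers to.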