The paper introduces NESS, a continual learning method that mitigates catastrophic forgetting by using an approximate null space of previous tasks, estimated from the directions associated with the smallest singular values of each layer's input representations. NESS enforces orthogonality directly in the weight space: task-specific updates are parameterized in a low-rank adaptation (LoRA-style) form and constrained to this estimated null space. Experiments on three benchmark datasets show competitive performance, low forgetting, and stable accuracy across tasks, supporting the use of small singular values for continual learning.
Instead of projecting gradients, NESS mitigates catastrophic forgetting by constraining weight updates directly to the null space of previous tasks, estimated from small singular values.
Alleviating catastrophic forgetting while enabling further learning is a primary challenge in continual learning (CL). Orthogonality-based training methods have gained attention for their efficiency and strong theoretical properties, and many existing approaches enforce orthogonality through gradient projection. In this paper, we revisit orthogonality and exploit the fact that the singular vectors associated with small singular values span directions that are nearly orthogonal to the input space of previous tasks. Building on this principle, we introduce NESS (Null-space Estimated from Small Singular values), a CL method that applies orthogonality directly in the weight space rather than through gradient manipulation. Specifically, NESS constructs an approximate null space from the singular vectors associated with the smallest singular values of each layer's input representation and parameterizes task-specific updates via a compact low-rank adaptation (LoRA-style) formulation constrained to this subspace. The subspace basis is kept fixed to preserve the null-space constraint, and only a single trainable matrix is learned for each task. This design ensures that the resulting updates remain approximately in the null space of previous inputs while still enabling adaptation to new tasks. Our theoretical analysis and experiments on three benchmark datasets demonstrate competitive performance, low forgetting, and stable accuracy across tasks, highlighting the role of small singular values in continual learning. The code is available at https://github.com/pacman-ctm/NESS.
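The null-space idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that); it only shows the general principle under stated assumptions: a single linear layer computing `y = W x`, a fixed basis taken from the right singular vectors of the previous-task inputs with the smallest singular values, and an update of the form `delta = trainable @ basis.T`. The function names, dimensions, and synthetic data are hypothetical and chosen for illustration.

```python
import numpy as np


def estimate_null_space(inputs, k):
    """Return a (d_in, k) orthonormal basis for the k directions of
    smallest singular value of `inputs` (shape: n_samples x d_in).

    Assumption: the approximate null space is taken from the last k
    right singular vectors; the paper's selection rule may differ.
    """
    # np.linalg.svd returns singular values in descending order, so the
    # last rows of vt belong to the smallest singular values.
    _, _, vt = np.linalg.svd(inputs, full_matrices=True)
    return vt[-k:].T


def constrained_update(trainable, basis):
    """LoRA-style update: a trainable (d_out, k) factor times the fixed
    (k, d_in) transposed basis. Only `trainable` would be learned."""
    return trainable @ basis.T


rng = np.random.default_rng(0)
d_in, d_out, rank_old, k = 16, 4, 6, 8

# Synthetic "previous task" inputs that occupy a rank-6 subspace of R^16,
# so a 10-dimensional exact null space exists (illustrative only).
old_inputs = rng.normal(size=(200, rank_old)) @ rng.normal(size=(rank_old, d_in))

weight = rng.normal(size=(d_out, d_in))       # stand-in for the current layer weight
basis = estimate_null_space(old_inputs, k)    # kept fixed
trainable = rng.normal(size=(d_out, k))       # stand-in for the learned matrix
delta = constrained_update(trainable, basis)
new_weight = weight + delta                   # layer weight after the update

# Largest change the update causes on old inputs versus on fresh inputs.
old_drift = np.abs(old_inputs @ delta.T).max()
new_inputs = rng.normal(size=(5, d_in))
new_shift = np.abs(new_inputs @ delta.T).max()

print(f"max output change on old inputs: {old_drift:.2e}")
print(f"max output change on new inputs: {new_shift:.2e}")
```

Because every old input `x` satisfies `basis.T @ x ≈ 0`, the update leaves the layer's outputs on previous-task data essentially unchanged, while inputs with components in the null-space directions can still be fitted through `trainable`. With real representations the small singular values are rarely exactly zero, so the preservation is only approximate, as the abstract notes.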