This paper investigates catastrophic forgetting in parameter-efficient fine-tuning (PEFT) methods based on low-rank decomposition, such as LoRA, when they are applied to sequential learning. Through empirical analysis, the authors show that the geometry and parameterization of the update subspace strongly influence the degree of forgetting. They find that tensor-based decompositions and structurally aligned parameterizations mitigate forgetting more effectively than methods that restrict updates to small, shared matrix subspaces.
Tensor-based PEFT methods such as LoRETTA can mitigate catastrophic forgetting in sequential learning by capturing richer structural information within ultra-compact parameter budgets.
Parameter-efficient fine-tuning (PEFT) based on low-rank decomposition, such as LoRA, has become a standard approach to adapting large pretrained models. However, its behavior in sequential learning, particularly with respect to catastrophic forgetting, remains insufficiently understood. In this work, we present an empirical study showing that forgetting is strongly influenced by the geometry and parameterization of the update subspace. While methods that restrict updates to small, shared matrix subspaces often suffer from task interference, tensor-based decompositions (e.g., LoRETTA) mitigate forgetting by capturing richer structural information within ultra-compact budgets, and structurally aligned parameterizations (e.g., WeGeFT) preserve pretrained representations. Our findings highlight update subspace design as a key factor in continual learning and offer practical guidance for selecting efficient adaptation strategies in sequential settings.
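As a rough illustration of what "restricting updates to a small matrix subspace" means for LoRA-style methods, the sketch below builds a rank-r update as the product of two thin factors and compares its parameter count with a dense update. It is not the authors' code: the function names, dimensions, and seed are hypothetical and chosen only for illustration. The tensor-based (LoRETTA) and structurally aligned (WeGeFT) parameterizations are not shown.

```python
import random


def full_param_count(d_out, d_in):
    """Parameters in a dense update to a d_out x d_in weight matrix."""
    return d_out * d_in


def lora_param_count(d_out, d_in, rank):
    """Parameters in a rank-`rank` update B @ A (B: d_out x rank, A: rank x d_in)."""
    return rank * (d_out + d_in)


def make_factors(d_out, d_in, rank, seed=0):
    """Create LoRA-style factors: A is small random Gaussian, B starts at zero,
    so the initial update B @ A is exactly zero."""
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
    B = [[0.0] * rank for _ in range(d_out)]
    return B, A


def lora_delta(B, A):
    """Dense update B @ A as nested lists; its rank is at most len(A)."""
    rank, d_in = len(A), len(A[0])
    return [
        [sum(row[k] * A[k][j] for k in range(rank)) for j in range(d_in)]
        for row in B
    ]


if __name__ == "__main__":
    d_out, d_in, rank = 768, 768, 8  # illustrative sizes, not from the paper
    print("dense update params:", full_param_count(d_out, d_in))
    print("low-rank update params:", lora_param_count(d_out, d_in, rank))
```

In a sequential setting, every task trained through the same pair of factors writes into the same rank-r subspace, which is one intuition for the task interference the abstract describes.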