May 28, 2026 | arXiv:2605.30260

How LoRA Remembers? A Parametric Memory Law for LLM Finetuning

Ziwen Xu, Haiwen Hong, Linsong Yu, Benglei Cui, Longtao Huang, Hui Xue, Ningyu Zhang

AI Summary

This paper investigates the parametric memory capacity of LLMs when LoRA is used for knowledge updates. It introduces a "Parametric Memory Law", a power law that quantitatively links loss reduction to effective parameters and sequence length. At the token level, it reports a deterministic phase transition: a prediction probability above 0.5 is a sufficient condition for verbatim recall under greedy decoding. Building on this, the authors develop MemFT, a threshold-guided optimization strategy that redirects the training budget toward sub-threshold tokens to improve memory fidelity and efficiency.

Key Contribution

LLMs exhibit a quantifiable "Parametric Memory Law" during LoRA finetuning, where loss reduction scales predictably with effective parameters and sequence length, revealing fundamental limits on how much they can memorize.
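As an illustration of how a power law of this kind is usually estimated, the sketch below fits exponents by least squares in log space. The functional form ΔL = C · N^α · T^β (N = effective parameters, T = sequence length) is an assumption made only for this example, since this page does not state the paper's exact form. The data are synthetic and the function name `fit_power_law` is hypothetical.

```python
import numpy as np

def fit_power_law(n_params, seq_len, delta_loss):
    """Fit log(dL) = log(C) + alpha*log(N) + beta*log(T) by least squares.

    ASSUMPTION: a generic two-variable power law, not the paper's stated form.
    """
    n_params = np.asarray(n_params, dtype=float)
    seq_len = np.asarray(seq_len, dtype=float)
    delta_loss = np.asarray(delta_loss, dtype=float)

    # Design matrix: intercept, log N, log T.
    X = np.column_stack([np.ones(len(n_params)), np.log(n_params), np.log(seq_len)])
    y = np.log(delta_loss)

    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    log_c, alpha, beta = coef
    return float(np.exp(log_c)), float(alpha), float(beta)

# Synthetic, noise-free data generated from known exponents (illustrative only).
rng = np.random.default_rng(0)
N = rng.uniform(1e4, 1e7, size=200)   # hypothetical effective-parameter counts
T = rng.uniform(64, 4096, size=200)   # hypothetical sequence lengths
dL = 2.0 * N**0.3 * T**-0.5

c_hat, alpha_hat, beta_hat = fit_power_law(N, T, dL)
print(c_hat, alpha_hat, beta_hat)
```

Fitting in log space turns the multiplicative law into a linear regression, so the exponents can be read directly from the fitted coefficients.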

Abstract

Large Language Models (LLMs) must continuously learn and update knowledge to remain effective in dynamic real-world environments. While Low-Rank Adaptation (LoRA) is widely used for such memory updates, existing studies mainly rely on qualitative downstream evaluations, leaving the quantitative capacity limits and underlying dynamics of exact parametric memory largely unexplored. To bridge this gap, we employ LoRA as a controlled memory capacity probe within the latent space to systematically quantify exact parametric memory. We introduce the Parametric Memory Law, a robust power law linking loss reduction ΔL to effective parameters and sequence length. At the token level, fine-grained analysis reveals a deterministic phase transition, demonstrating that a prediction probability of p > 0.5 constitutes a sufficient condition for verbatim recall under greedy decoding. Driven by these insights, we introduce MemFT, a threshold-guided optimization strategy that dynamically redistributes the training budget toward sub-threshold tokens. Empirical evaluations demonstrate that MemFT can enhance memory fidelity and efficiency. Code will be released at https://github.com/zjunlp/ParametricMemoryLaw.
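The p > 0.5 condition can be read as a simple counting argument: next-token probabilities sum to 1, so if the target token holds more than half the mass, no other token can exceed it, and greedy (argmax) decoding must emit the target. If that holds at every position, the greedy prefix always matches the memorized text and the sequence is reproduced verbatim. The sketch below checks this numerically and shows one possible way to focus the loss on sub-threshold tokens. The masking scheme and all function names are assumptions for illustration; this is not the authors' MemFT implementation, whose details are not given here.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def target_probs(logits, targets):
    """Probability assigned to the target token at each position."""
    probs = softmax(logits)
    return probs[np.arange(len(targets)), targets]

def sub_threshold_mask(p_target, threshold=0.5):
    """True where a token is NOT yet guaranteed under greedy decoding."""
    return p_target <= threshold

def masked_nll(logits, targets, threshold=0.5):
    """Mean negative log-likelihood over sub-threshold tokens only.

    ASSUMPTION: hard masking is one hypothetical way to redistribute
    training budget; the paper's actual strategy may differ.
    """
    p = target_probs(logits, targets)
    mask = sub_threshold_mask(p, threshold)
    if not mask.any():
        return 0.0  # every token is already above the threshold
    return float(-np.log(p[mask]).mean())

# Random logits: 50 positions over a hypothetical 8-token vocabulary.
rng = np.random.default_rng(1)
logits = rng.normal(scale=3.0, size=(50, 8))
targets = rng.integers(0, 8, size=50)

p = target_probs(logits, targets)
above = p > 0.5
# Wherever p > 0.5, the greedy choice coincides with the target token.
print(bool((logits.argmax(axis=-1)[above] == targets[above]).all()))
print(masked_nll(logits, targets))
```

Note that the condition is sufficient but not necessary: a target token with p below 0.5 can still be the argmax when the remaining mass is spread over many alternatives.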

Architecture Design (Transformers, SSMs, MoE) | Scaling Laws & Emergent Abilities | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 41

Year: 2026

Venue: N/A
