This paper explores the feasibility of adding persistent memory to frozen encoder-decoder LLMs by training small adapters to write to and read from a memory bank in the model's continuous latent space. Six architectural methods, spanning three injection points and four write mechanisms, are implemented and evaluated on the LoCoMo dataset. At the larger of two memory capacities tested (10x), all six trained adapters show positive memory recall, enabling conversational learning without updating the frozen backbone; at the smaller capacity (1x), three of the methods collapse.
Frozen LLMs can learn to remember things across conversations, even with limited resources, by training adapters to read and write to a continuous latent space memory bank.
Frozen encoder--decoder language models are stateless: the latent representation is discarded after every forward pass, so no information persists across sessions. This paper presents a \textbf{proof-of-concept pilot study} showing that persistent memory in the \emph{continuous latent space} of a frozen LLM is feasible -- even under severe resource constraints (a single frozen Flan-T5-XL backbone, small trainable adapters, a single dataset). We implement six architectural methods spanning three injection points and four write mechanisms; unlike text-level memory systems, every write and read is a differentiable operation on dense vectors. After only the adapters are trained, the memory bank continues to accumulate new entries at inference time without gradients, enabling \emph{conversational learning}. Under a forgetting-curve evaluation on LoCoMo at two capacity scales (1$\times$ and 10$\times$), the stateless baseline scores exactly zero; at 10$\times$ all six trained adapters produce positive memory-recall curves; at 1$\times$ three methods collapse, revealing capacity as a critical design parameter. Because the memory bank is a compact numerical array, it can be scaled to arbitrarily large capacity without altering the backbone. We argue that full end-to-end training with larger models, more data, and orders-of-magnitude larger memory will yield substantially stronger results; this pilot study establishes the feasibility baseline and design-space taxonomy that such efforts require.
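To make the idea of a latent memory bank concrete, the following is a minimal sketch, not one of the paper's six methods. It assumes a ring-buffer write (the oldest slot is overwritten when the bank is full) and a softmax-attention read over the stored vectors. The class name `LatentMemoryBank`, the dimensions, and the random "fact" vectors are all hypothetical stand-ins for encoder latents. There are no trained adapters here, so the sketch only illustrates why capacity matters for recalling old entries.

```python
import numpy as np


class LatentMemoryBank:
    """Fixed-capacity bank of dense vectors (illustrative, hypothetical design)."""

    def __init__(self, capacity: int, dim: int):
        self.slots = np.zeros((capacity, dim), dtype=np.float32)
        self.capacity = capacity
        self.count = 0  # total number of writes so far

    def write(self, latent: np.ndarray) -> None:
        # Assumed write rule: ring buffer, overwriting the oldest slot when full.
        self.slots[self.count % self.capacity] = latent
        self.count += 1

    def read(self, query: np.ndarray) -> np.ndarray:
        # Assumed read rule: scaled dot-product attention over the filled slots.
        n = min(self.count, self.capacity)
        if n == 0:
            return np.zeros_like(query)  # empty bank: nothing to recall
        keys = self.slots[:n]
        scores = keys @ query / np.sqrt(query.shape[0])
        weights = np.exp(scores - scores.max())  # numerically stable softmax
        weights /= weights.sum()
        return weights @ keys


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Random vectors stand in for the latents of 20 conversational "facts".
rng = np.random.default_rng(0)
dim = 64
facts = rng.standard_normal((20, dim)).astype(np.float32)

small = LatentMemoryBank(capacity=4, dim=dim)   # too small to hold every fact
large = LatentMemoryBank(capacity=40, dim=dim)  # 10x the small bank
for fact in facts:
    small.write(fact)
    large.write(fact)

# Query with the oldest fact: only the large bank still holds it.
recall_small = cosine(small.read(facts[0]), facts[0])
recall_large = cosine(large.read(facts[0]), facts[0])
print(f"recall of oldest fact: small={recall_small:.2f}, large={recall_large:.2f}")
```

In this toy setting the small bank has already overwritten the oldest entry, so its read returns a mix of unrelated vectors, while the larger bank retrieves something close to the original. The paper's adapters learn their write and read operations rather than using fixed rules like these.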