This paper measures the gap between language model and human memory by running classic memory experiments from psychology on both. It finds that out-of-the-box language models remember more than humans do, even when prompted to imitate human behavior. The authors then show that better prompting strategies and a "compactor" can induce more human-like forgetting in LMs, and they report preliminary evidence that this improves user simulation in a downstream education task.
LLMs remember too much to be good user simulators, but targeted prompting and a novel "compactor" can make them forget like humans do.
Language models are increasingly being deployed as user simulators, but their memory is far more reliable than that of real users. To measure this gap, we run a series of classic memory experiments from psychology on both humans and language models. Across tasks, we find that out-of-the-box language models exhibit better memory than humans, even when prompted to imitate human behavior. We then show that better prompting strategies and the use of a compactor can cause language models to forget content in a more human-like way. Using these methods, we show preliminary evidence that language models with human-like memory constraints can function as more effective user simulators in a downstream education task. Finally, we release human reference data and benchmarks to support future work on simulating human memory with language models.