This paper introduces a neuroscience-grounded memory architecture for LLMs designed to address the limitations of fixed context windows and improve long-term interaction. The architecture incorporates valence-based memory organization, dual-process retrieval mechanisms, and active encoding driven by curiosity and feedback. By mimicking human cognitive processes, the proposed system aims to create LLMs that exhibit more efficient and context-aware interactions over time, converging towards faster, System 1-like processing.
LLMs can achieve human-like efficiency in long-term interactions by structuring memory around emotional valence, prioritizing automatic retrieval, and actively encoding information based on curiosity and feedback.
Large language models lack persistent, structured memory for long-term interaction and context-sensitive retrieval. Expanding context windows does not solve this: recent evidence shows that context length alone degrades reasoning by up to 85% - even with perfect retrieval. We propose a bio-inspired memory framework grounded in complementary learning systems theory, cognitive behavioral therapy's belief hierarchy, dual-process cognition, and fuzzy-trace theory, organized around three principles: (1) Memory has valence, not just content - pre-computed emotional-associative summaries (valence vectors) organized in an emergent belief hierarchy inspired by Beck's cognitive model enable instant orientation before deliberation; (2) Retrieval defaults to System 1 with System 2 escalation - automatic spreading activation and passive priming as default, with deliberate retrieval only when needed, and graded epistemic states that address hallucination structurally; and (3) Encoding is active, present, and feedback-dependent - a thalamic gateway tags and routes information between stores, while the executive forms gists through curiosity-driven investigation, not passive exposure. Seven functional properties specify what any implementation must satisfy. Over time, the system converges toward System 1 processing - the computational analog of clinical expertise - producing interactions that become cheaper, not more expensive, with experience.
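One way to picture the second principle (System 1 retrieval by default, with System 2 escalation) is the toy sketch below. It is not the paper's implementation. The `Memory` class, the two-dimensional valence vectors, the use of cosine similarity as a stand-in for spreading activation, the `threshold` value, and the epistemic-state labels are all assumptions made for illustration only.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Memory:
    gist: str                 # short gist of the stored experience
    valence: Sequence[float]  # hypothetical pre-computed valence vector


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(
    cue: Sequence[float],
    store: List[Memory],
    threshold: float = 0.8,  # assumed confidence cutoff, not from the paper
) -> Tuple[str, str, List[Memory]]:
    """Return (route, epistemic_state, memories).

    System 1: a single cheap similarity pass; answer immediately if the
    best match is strong enough.
    System 2: otherwise escalate and hand back all candidates, ranked,
    for slower deliberate inspection (left abstract here).
    """
    ranked = sorted(store, key=lambda m: cosine(cue, m.valence), reverse=True)
    if ranked and cosine(cue, ranked[0].valence) >= threshold:
        return "system1", "known", ranked[:1]
    # Graded state: some weak evidence vs. nothing stored at all.
    state = "uncertain" if ranked else "unknown"
    return "system2", state, ranked


store = [
    Memory("user prefers concise answers", (0.9, 0.1)),
    Memory("user was frustrated by jargon", (-0.8, 0.3)),
]

route, state, hits = retrieve((1.0, 0.0), store)
print(route, state, hits[0].gist)

route2, state2, hits2 = retrieve((0.0, 1.0), store)
print(route2, state2, len(hits2))
```

In this sketch the cheap path returns a single memory when the match is strong, and the fallback path returns every candidate along with a weaker epistemic label, so the caller can decide whether to deliberate or to say it does not know. A real system would replace the similarity pass with whatever activation mechanism the architecture specifies.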