The paper introduces Knowledge Capsules, a novel method for incorporating external knowledge into LLMs by representing normalized relational knowledge as structured nonparametric memory units. These capsules are constructed from document corpora using a frozen base model and integrated via an External Key Value Injection (KVI) framework that allows direct participation in the model's attention computation. Experiments on QA benchmarks demonstrate that Knowledge Capsules outperform RAG and GraphRAG, especially in long-context and multi-hop reasoning scenarios, without requiring parameter updates.
Forget RAG's indirect knowledge injection – Knowledge Capsules let external knowledge directly influence LLM attention, boosting performance and stability in complex reasoning tasks.
Large language models (LLMs) encode knowledge in parametric weights, making it costly to update or extend without retraining. Retrieval-augmented generation (RAG) mitigates this limitation by appending retrieved text to the input, but it operates purely through context expansion, where external knowledge competes as tokens within the attention mechanism. As a result, its influence is indirect and often unstable, particularly in long-context and multi-hop reasoning scenarios. We propose Knowledge Capsules, structured nonparametric memory units that represent normalized relational knowledge and can be constructed directly from document corpora using a frozen base model. Instead of injecting knowledge as text, we introduce an External Key Value Injection (KVI) framework that compiles capsules into attention-compatible key-value representations, enabling external knowledge to participate directly in the model's attention computation. By shifting knowledge integration from context-level augmentation to memory-level interaction, the proposed framework consistently outperforms RAG and GraphRAG across multiple QA benchmarks, with improved stability and accuracy in long-context and multi-hop reasoning, while requiring no parameter updates.
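To make the contrast with text-level retrieval concrete, the following is a minimal sketch of what "participating directly in attention" could look like for a single attention head. It is not the authors' implementation: the abstract does not describe how capsules are compiled, so the capsule keys and values here are simply given as vectors, and the names `attend`, `attend_with_capsules`, `capsule_keys`, and `capsule_values` are hypothetical. The only idea illustrated is that external key-value pairs are placed alongside the context's own keys and values before the softmax, rather than being appended to the prompt as tokens.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query: Vector, keys: Sequence[Vector]) -> List[float]:
    """Scaled dot-product attention weights of one query over a set of keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


def attend(query: Vector, keys: Sequence[Vector], values: Sequence[Vector]) -> List[float]:
    """Standard single-head attention: a weighted sum of the value vectors."""
    weights = attention_weights(query, keys)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def attend_with_capsules(
    query: Vector,
    ctx_keys: Sequence[Vector],
    ctx_values: Sequence[Vector],
    capsule_keys: Sequence[Vector],    # assumption: precomputed from capsules
    capsule_values: Sequence[Vector],  # assumption: precomputed from capsules
) -> List[float]:
    """Hypothetical injection step: external key-value pairs join the context's
    own pairs, so they are scored by the same softmax. No weights are updated."""
    keys = list(capsule_keys) + list(ctx_keys)
    values = list(capsule_values) + list(ctx_values)
    return attend(query, keys, values)


if __name__ == "__main__":
    query = [1.0, 0.0]
    ctx_keys, ctx_values = [[0.0, 1.0]], [[0.0, 1.0]]
    capsule_keys, capsule_values = [[1.0, 0.0]], [[1.0, 0.0]]

    print("context only: ", attend(query, ctx_keys, ctx_values))
    print("with capsules:", attend_with_capsules(
        query, ctx_keys, ctx_values, capsule_keys, capsule_values))
```

In this toy setup the capsule key is aligned with the query, so the injected value receives the larger share of attention even though no extra tokens were added to the input. A real system would need to produce capsule keys and values in each layer's and head's representation space, which this sketch does not attempt.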