EfficientGraph-RAG treats retrieval-augmented generation as explicit retrieval-state management, built on three mechanisms: a typed hierarchical state space (TAM), role-specialized agents that update and verify state (MARS), and hierarchical storage with access control for reusable state (SMP). With one shared configuration, it ranks first on the reported answer-quality metrics averaged over three LongBench retrieval-style subsets, matches the strongest agentic baseline on HotpotQA EM while using 3.51× fewer large-model tokens, and gives a low-token DocVQA result among retrieval-organizing cross-modal methods. Component analysis indicates that MARS is the main driver of answer quality, TAM supplies the typed traversal state, and SMP enables corpus-dependent reuse, with cross-query cache hit rates of 3.77% to 23.18%.
By explicitly structuring and managing retrieval state, EfficientGraph-RAG ranks first on the evaluated LongBench retrieval-style subsets and matches the strongest agentic baseline on HotpotQA EM while cutting large-model token usage by 3.51×.
Retrieval-augmented generation (RAG) has become the standard way to ground large language models in external knowledge, but many systems still organize evidence as flat chunks and retrieve it through largely unstructured search. This weak structure becomes a bottleneck for complex retrieval: the system must decide where to search, how to move from coarse topics to entity-relation evidence, which evidence has been verified, and which intermediate artifacts can be reused. We define these intermediate variables as a retrieval state and study RAG as structured state management. EfficientGraph-RAG makes this state explicit through three coupled mechanisms: TAM defines a typed hierarchical state space over evidence, MARS updates and verifies the state through role-specialized agents, and SMP stores reusable state under hierarchy-aware access control. Using one shared framework configuration, EfficientGraph-RAG ranks first on the reported answer-quality metrics averaged over the three evaluated LongBench retrieval-style subsets, matches the strongest agentic baseline on HotpotQA EM while reducing large-model token usage by $3.51\times$, and achieves a low-token DocVQA result among retrieval-organizing cross-modal methods. Component analysis shows that each mechanism plays a distinct role: MARS is the main answer-quality driver, TAM supplies the typed traversal state and the Adaptive Routing signal, and SMP enables corpus-dependent reuse, with cross-query cache hit rates ranging from 3.77% to 23.18%.
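To make the idea of an explicit retrieval state more concrete, here is a minimal Python sketch. It is not the paper's implementation: the class names (`Level`, `StateNode`, `RetrievalState`, `StateStore`), the three-level hierarchy, and the level-based access rule are all assumptions chosen for illustration. They loosely mirror the roles the abstract assigns to the three mechanisms: a typed hierarchy over evidence, a verification flag set by an agent, and a store that reuses verified state under access control.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, List, Optional


class Level(IntEnum):
    """Hypothetical hierarchy levels, ordered coarse to fine."""
    TOPIC = 0
    ENTITY = 1
    RELATION = 2


@dataclass
class StateNode:
    """One typed piece of evidence in the retrieval state (illustrative)."""
    node_id: str
    level: Level
    text: str
    parent: Optional[str] = None
    verified: bool = False


@dataclass
class RetrievalState:
    """Typed hierarchical state: topics at the root, finer evidence below."""
    nodes: Dict[str, StateNode] = field(default_factory=dict)

    def add(self, node: StateNode) -> None:
        # Assumed typing rule: a child sits exactly one level below its parent.
        if node.parent is not None:
            if node.level != self.nodes[node.parent].level + 1:
                raise ValueError("child must be one level below its parent")
        elif node.level != Level.TOPIC:
            raise ValueError("only topic nodes may be roots")
        self.nodes[node.node_id] = node

    def verify(self, node_id: str, supported: bool) -> None:
        # Stand-in for the decision of a verifier agent.
        self.nodes[node_id].verified = supported

    def verified_evidence(self) -> List[StateNode]:
        return [n for n in self.nodes.values() if n.verified]


class StateStore:
    """Toy reusable store with an assumed level-based access rule."""

    def __init__(self) -> None:
        self._cache: Dict[str, StateNode] = {}
        self.hits = 0
        self.lookups = 0

    def put(self, key: str, node: StateNode) -> None:
        if node.verified:  # only verified state is kept for reuse
            self._cache[key] = node

    def get(self, key: str, max_level: Level) -> Optional[StateNode]:
        self.lookups += 1
        node = self._cache.get(key)
        if node is None or node.level > max_level:
            return None  # miss, or caller may not read this level
        self.hits += 1
        return node

    def hit_rate(self) -> float:
        return self.hits / self.lookups if self.lookups else 0.0


state = RetrievalState()
state.add(StateNode("t1", Level.TOPIC, "Marie Curie"))
state.add(StateNode("e1", Level.ENTITY, "Nobel Prize in Physics 1903", parent="t1"))
state.verify("e1", supported=True)

store = StateStore()
store.put("curie:nobel", state.nodes["e1"])
store.get("curie:nobel", max_level=Level.TOPIC)   # denied: entity level not allowed
store.get("curie:nobel", max_level=Level.ENTITY)  # served from the store
print(store.hit_rate())  # 0.5
```

In this toy setup one of two lookups is served from the store, so the hit rate is 0.5. The cross-query cache hit rates reported in the abstract (3.77% to 23.18%) are measured on real corpora and are not reproduced by this sketch.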