SPD-RAG is introduced as a hierarchical multi-agent framework for question answering across large document corpora, where each document is handled by a dedicated agent for focused retrieval. A coordinator agent then aggregates partial answers from document-level agents, and a token-bounded synthesis layer merges these into a final answer. Experiments on the LOONG benchmark demonstrate that SPD-RAG outperforms standard and agentic RAG approaches while significantly reducing API costs compared to full-context baselines.
By decomposing RAG along the document axis with specialized agents, SPD-RAG outperforms standard and agentic RAG baselines on multi-document QA while cutting API costs by over 60% relative to a full-context baseline.
Answering complex, real-world queries often requires synthesizing facts scattered across vast document corpora. In these settings, standard retrieval-augmented generation (RAG) pipelines suffer from incomplete evidence coverage, while long-context large language models (LLMs) struggle to reason reliably over massive inputs. We introduce SPD-RAG, a hierarchical multi-agent framework for exhaustive cross-document question answering that decomposes the problem along the document axis. Each document is processed by a dedicated document-level agent operating only on its own content, enabling focused retrieval, while a coordinator dispatches tasks to relevant agents and aggregates their partial answers. The partial answers are then merged by a token-bounded synthesis layer, which supports recursive map-reduce for massive corpora. This document-level specialization with centralized fusion improves scalability and answer quality in heterogeneous multi-document settings while yielding a modular, extensible retrieval pipeline. On the LOONG benchmark (EMNLP 2024) for long-context multi-document QA, SPD-RAG achieves an Avg Score of 58.1 (GPT-5 evaluation), outperforming Normal RAG (33.0) and Agentic RAG (32.8) while using only 38% of the API cost of a full-context baseline (Avg Score 68.0).
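The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Every name in it (`DocumentAgent`, `pack`, `synthesize`, `answer_query`, `count_tokens`) is hypothetical, keyword matching stands in for per-document retrieval, and a caller-supplied `merge` function stands in for the LLM call that would fuse partial answers. The coordinator here simply queries every agent and drops empty answers, which simplifies the relevance-based dispatch the abstract mentions.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


def count_tokens(text: str) -> int:
    """Crude whitespace token count; a stand-in for a real tokenizer."""
    return len(text.split())


def _terms(text: str) -> Set[str]:
    """Lowercased words longer than three characters (toy keyword extraction)."""
    words = (w.lower().strip("?.,!") for w in text.split())
    return {w for w in words if len(w) > 3}


@dataclass
class DocumentAgent:
    """Hypothetical document-level agent that sees only its own document."""
    doc_id: str
    text: str

    def answer(self, query: str) -> str:
        # Toy retrieval: keep sentences sharing a keyword with the query.
        wanted = _terms(query)
        hits = [s.strip() for s in self.text.split(".") if wanted & _terms(s)]
        return ". ".join(hits)


def pack(partials: List[str], budget: int) -> List[List[str]]:
    """Greedily group partial answers so each group stays within the token budget."""
    batches, current, used = [], [], 0
    for p in partials:
        n = count_tokens(p)
        if current and used + n > budget:
            batches.append(current)
            current, used = [], 0
        current.append(p)
        used += n
    if current:
        batches.append(current)
    return batches


def synthesize(partials: List[str], merge: Callable[[List[str]], str], budget: int) -> str:
    """Recursive map-reduce: merge budget-sized batches until one answer remains."""
    if not partials:
        return ""
    while len(partials) > 1:
        batches = pack(partials, budget)
        if len(batches) == len(partials):
            # No two items fit together; merge pairwise so each round makes progress.
            batches = [partials[i:i + 2] for i in range(0, len(partials), 2)]
        partials = [merge(batch) for batch in batches]
    return partials[0]


def answer_query(query: str, agents: List[DocumentAgent],
                 merge: Callable[[List[str]], str], budget: int = 50) -> str:
    """Coordinator: collect non-empty partial answers, then fuse them."""
    partials = [f"[{a.doc_id}] {ans}" for a in agents if (ans := a.answer(query))]
    return synthesize(partials, merge, budget)


agents = [
    DocumentAgent("a", "Revenue rose in 2023. Staff grew."),
    DocumentAgent("b", "Revenue fell in 2022."),
    DocumentAgent("c", "Nothing relevant here."),
]
result = answer_query("What happened to revenue?", agents, merge=" | ".join)
print(result)
```

In a real system `merge` would be an LLM call that compresses its inputs, so each reduce round shrinks the intermediate text. The pairwise fallback in `synthesize` exists only so the toy string-joining `merge` used here still terminates under a very small budget.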