This paper introduces cost-aware Retrieval-Augmented Generation (RAG), a setting that drops the usual assumption that external knowledge is free to access. The authors augment MS MARCO v2.1 with access-cost tiers and evaluate how an explicit evidence-access budget affects evidence selection for both general-domain and domain-specific question answering. Static evidence selection proves brittle: no fixed selector uniformly dominates, and larger budgets do not reliably improve answer quality. Agentic cost-aware RAG, in which an LLM decides when to retrieve, which tier to access, and when to stop, shows promise for adaptive evidence acquisition, but its behavior depends heavily on the model and the task.
Static evidence selection is brittle under budget constraints, which points to the need for adaptive strategies in retrieval-augmented systems.
Retrieval-Augmented Generation (RAG) typically assumes that external knowledge is free, but many high-quality sources are paywalled, licensed, restricted, or otherwise costly to access. We introduce cost-aware RAG, a setting where retrieved evidence is assigned access-cost tiers and systems must answer under an explicit evidence-access budget. We instantiate this setting by augmenting MS MARCO v2.1 with access-friction tiers and evaluate budgeted evidence selection across general-domain and domain-specific QA benchmarks. Our results show that static selection is brittle: no fixed selector uniformly dominates, and larger budgets do not reliably improve answer quality, even when costly evidence is domain-matched. We then study agentic cost-aware RAG, where an LLM decides when to retrieve, which tier to access, and when to stop. Agents show strong promise as adaptive evidence-acquisition controllers, but their behavior remains highly model- and task-dependent. These findings suggest that cost-aware evidence acquisition is a central challenge for the next generation of RAG systems. All code and data are available at https://github.com/Mignonmy/Cost-Aware.
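The budgeted selection setting described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the tier names and costs in `TIER_COST`, the `Passage` fields, and the two selectors `select_by_score` and `select_by_value_density`. It only shows how two fixed (static) selectors can pick different evidence under the same budget.

```python
from dataclasses import dataclass

# Hypothetical tier costs; the paper's actual tier definitions are not reproduced here.
TIER_COST = {"free": 0, "low": 1, "high": 3}


@dataclass(frozen=True)
class Passage:
    pid: str
    score: float  # retriever relevance score, assumed higher is better
    tier: str     # key into TIER_COST


def _fill(ranked, budget):
    """Walk a ranked list and keep every passage that still fits the budget."""
    chosen, spent = [], 0
    for p in ranked:
        cost = TIER_COST[p.tier]
        if spent + cost <= budget:
            chosen.append(p)
            spent += cost
    return chosen, spent


def select_by_score(passages, budget):
    """Static selector: rank by relevance score alone."""
    ranked = sorted(passages, key=lambda p: p.score, reverse=True)
    return _fill(ranked, budget)


def select_by_value_density(passages, budget):
    """Static selector: rank by score per unit cost; free passages come first."""
    def density(p):
        cost = TIER_COST[p.tier]
        return float("inf") if cost == 0 else p.score / cost
    ranked = sorted(passages, key=density, reverse=True)
    return _fill(ranked, budget)


# Toy candidate pool (made-up scores and tiers).
passages = [
    Passage("a", 0.9, "high"),
    Passage("b", 0.6, "low"),
    Passage("c", 0.5, "low"),
    Passage("d", 0.3, "free"),
]

by_score, spent_score = select_by_score(passages, budget=3)
by_density, spent_density = select_by_value_density(passages, budget=3)
print([p.pid for p in by_score], spent_score)      # ['a', 'd'] 3
print([p.pid for p in by_density], spent_density)  # ['d', 'b', 'c'] 2
```

With a budget of 3, the score-ranked selector spends everything on one high-tier passage, while the density-ranked selector gathers three cheaper ones and leaves budget unspent. Which choice yields the better answer depends on the question, which is consistent with the abstract's observation that no fixed selector uniformly dominates.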