The paper frames retrieval-augmented generation (RAG) design as an architecture search problem, highlighting the challenge of tuning numerous RAG hyperparameters. To address this, the authors introduce RAISE, a framework and benchmark for RAG hyperparameter optimization that implements 13 search algorithms and evaluates them across seven datasets. The results show that optimization performance varies substantially across tasks, suggesting that no single optimization strategy is universally superior.
How well a RAG optimization strategy performs depends so heavily on the task that a method that succeeds on one dataset may not carry over to another.
Retrieval-augmented generation (RAG) systems expose numerous design choices spanning query rewriting, chunking, retrieval depth, reranking, and context compression. In practice, these choices are often configured through heuristics, hindering systematic evaluation and reproducibility across settings. We argue that this challenge is best formulated as RAG architecture search. To support controlled and reproducible study of this problem, we introduce the RAG Intelligence Search Engine (RAISE), a comprehensive framework and benchmark for RAG hyperparameter optimization, which evaluates optimization methods for RAG pipelines under standardized search spaces and budgets. RAISE implements 13 search algorithms and evaluates them across seven public text and multimodal datasets using three random seeds. Our experiments show that optimization performance is highly task-dependent: methods that perform strongly on one dataset may not generalize consistently across others, cautioning against interpreting aggregate rankings as evidence of universally superior strategies. RAISE provides a common experimental substrate for fair, reproducible, and systematic research on RAG hyperparameter optimization.
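To make the setup concrete, here is a minimal sketch of what evaluating one search algorithm under a fixed search space, a fixed evaluation budget, and three random seeds could look like. It is not RAISE's implementation or API. The search space values, the function names (`sample_config`, `toy_score`, `random_search`, `run_benchmark`), and the budget of 20 are assumptions chosen for illustration. Plain random search stands in for whichever of the 13 algorithms is being studied, and `toy_score` is a synthetic objective standing in for running a real pipeline on a dataset.

```python
import random
from statistics import mean

# Hypothetical search space over the design choices named in the abstract.
# The specific values are illustrative, not taken from RAISE.
SEARCH_SPACE = {
    "query_rewriting": [False, True],
    "chunk_size": [128, 256, 512, 1024],
    "top_k": [1, 3, 5, 10, 20],
    "reranker": [None, "cross_encoder"],
    "compress_context": [False, True],
}


def sample_config(rng):
    """Draw one configuration uniformly at random from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def toy_score(config, rng):
    """Synthetic stand-in for running a RAG pipeline and scoring its answers.

    A real benchmark would build the pipeline from `config` and evaluate it
    on a dataset; this formula is made up purely so the sketch runs.
    """
    score = 0.5
    score += 0.1 if config["reranker"] else 0.0
    score -= abs(config["chunk_size"] - 512) / 5120
    score -= abs(config["top_k"] - 5) / 100
    score += rng.uniform(-0.02, 0.02)  # simulated evaluation noise
    return score


def random_search(evaluate, budget, seed):
    """Evaluate `budget` random configurations and keep the best one."""
    rng = random.Random(seed)  # a per-seed generator keeps runs reproducible
    best_config, best_score = None, float("-inf")
    for _ in range(budget):
        config = sample_config(rng)
        score = evaluate(config, rng)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def run_benchmark(evaluate, budget=20, seeds=(0, 1, 2)):
    """Repeat the search once per seed and report the mean best score."""
    results = [random_search(evaluate, budget, seed) for seed in seeds]
    return results, mean(score for _, score in results)


if __name__ == "__main__":
    results, avg = run_benchmark(toy_score)
    for seed, (config, score) in enumerate(results):
        print(f"seed={seed} best_score={score:.3f} config={config}")
    print(f"mean best score over seeds: {avg:.3f}")
```

Holding the search space, budget, and seeds fixed while swapping only the search function is what makes scores from different optimizers comparable. Repeating the same loop per dataset, rather than averaging over datasets first, keeps task-dependent differences visible.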