KohakuRAG is a hierarchical RAG framework for question answering that requires high-precision citations. It preserves document structure with a four-level tree representation and bottom-up embedding aggregation, widens retrieval coverage with an LLM-powered query planner and cross-query reranking, and stabilizes answers through ensemble inference with abstention-aware voting. The framework took first place in the WattBot 2025 Challenge, which supports the value of its hierarchical retrieval and ensemble methods.
A hierarchical RAG framework with ensemble inference and LLM-powered query planning takes first place in the WattBot 2025 Challenge, showing that carefully structured retrieval and answer stabilization are key to high-precision question answering.
Retrieval-augmented generation (RAG) systems that answer questions from document collections face compounding difficulties when high-precision citations are required: flat chunking strategies sacrifice document structure, single-query formulations miss relevant passages through vocabulary mismatch, and single-pass inference produces stochastic answers that vary in both content and citation selection. We present KohakuRAG, a hierarchical RAG framework that preserves document structure through a four-level tree representation (document $\rightarrow$ section $\rightarrow$ paragraph $\rightarrow$ sentence) with bottom-up embedding aggregation, improves retrieval coverage through an LLM-powered query planner with cross-query reranking, and stabilizes answers through ensemble inference with abstention-aware voting. We evaluate on the WattBot 2025 Challenge, a benchmark requiring systems to answer technical questions from 32 documents with $\pm$0.1% numeric tolerance and exact source attribution. KohakuRAG achieves first place on both the public and private leaderboards (final score 0.861) and is the only team to hold the top position across both evaluation partitions. Ablation studies show that prompt ordering (+80% relative), retry mechanisms (+69%), and ensemble voting with blank filtering (+1.2pp) each contribute substantially, while hierarchical dense retrieval alone matches hybrid sparse-dense approaches (adding BM25 yields only +3.1pp). We release KohakuRAG as open-source software at https://github.com/KohakuBlueleaf/KohakuRAG.
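To make "ensemble voting with blank filtering" concrete, here is a minimal Python sketch of one way such a vote could work. It is not taken from the KohakuRAG code base: the function name `vote`, the abstention marker `"is_blank"`, and the tie-breaking rule are all assumptions made for illustration, and the abstract does not specify the exact voting rule used.

```python
from collections import Counter
from typing import Optional, Sequence

# Hypothetical marker a model emits when it declines to answer (assumption).
BLANK = "is_blank"


def vote(answers: Sequence[Optional[str]], blank: str = BLANK) -> str:
    """Majority vote over ensemble answers, ignoring abstentions when possible.

    None, empty strings, and the blank marker all count as abstentions.
    The result is the blank marker only if every run abstained.
    """
    # Drop missing answers and strip surrounding whitespace.
    cleaned = [a.strip() for a in answers if a is not None]
    # Blank filtering: keep only runs that committed to an answer.
    substantive = [a for a in cleaned if a and a != blank]
    if not substantive:
        return blank
    # Counter.most_common orders by count; ties keep first-seen order.
    return Counter(substantive).most_common(1)[0][0]


if __name__ == "__main__":
    runs = ["12.5", "is_blank", "12.5", "13.0", ""]
    print(vote(runs))
```

In this sketch a single committed answer outvotes any number of abstentions, which is one possible reading of "abstention-aware"; a stricter variant could require a minimum number of non-blank runs before answering.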