The paper introduces Task-Aligned Retrieval (TAG), a retrieval framework that selects context by its applicability to the task rather than by semantic similarity. TAG transforms documents into condition-action rules, uses LLM judgments to identify the rules that apply to a given input, and generates outputs conditioned on the selected actions. Experiments on Wikipedia NPOV rewriting, HumanEval code generation with PEP 8 compliance, and NBA transaction reasoning show that TAG outperforms standard RAG, especially in high-mismatch settings, while reducing retrieved context by up to 93%.
Semantic similarity is a poor proxy for applicability when task success hinges on applying the right rules, so retrieve based on applicability instead.
Retrieval-augmented generation (RAG) ranks passages by semantic similarity to the input, implicitly assuming that semantic similarity is a reliable indication of applicability in downstream tasks. This assumption breaks down when task success depends not on topical relevance but on applying the correct rules, constraints, or procedural guidance. In such settings, the most useful context may be the rule triggered by the input rather than the most semantically similar passage. We propose Task-Aligned Retrieval (TAG), a retrieval framework that replaces similarity-based retrieval with applicability-based rule selection. TAG transforms source documents into traceable condition-action rules, identifies which rules apply to a given input through pairwise LLM judgments, and generates the output conditioned only on the selected actions. We empirically observe that across Wikipedia NPOV rewriting, HumanEval with PEP 8 compliance, and NBA transaction reasoning on RuleArena, TAG consistently outperforms standard RAG, with the largest gains in high-mismatch settings (up to 12.2%) while reducing retrieved context by up to 93%. These results suggest that, in rule- and instruction-governed tasks, retrieval should optimize for applicability rather than for semantic similarity alone.
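The three stages named in the abstract (condition-action rules, pairwise applicability judgments, generation conditioned on the selected actions) can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the names `Rule`, `Judge`, `select_applicable`, `build_prompt`, and `toy_judge` are hypothetical, the two rules and their `style-guide#N` source labels are made-up examples, and the substring-matching judge merely stands in for the LLM call the abstract describes.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Rule:
    """Hypothetical container for one condition-action rule."""
    condition: str  # when the rule applies
    action: str     # what to do when it applies
    source: str     # pointer back to the originating passage (traceability)


# A judge answers one pairwise question: does this rule apply to this input?
# In the abstract this is an LLM judgment; here it is any callable (assumption).
Judge = Callable[[Rule, str], bool]


def select_applicable(rules: List[Rule], task_input: str, judge: Judge) -> List[Rule]:
    """Keep only the rules the judge marks as applicable (one judge call per rule)."""
    return [rule for rule in rules if judge(rule, task_input)]


def build_prompt(task_input: str, selected: List[Rule]) -> str:
    """Build a generation prompt from the selected actions only, not full passages."""
    actions = "\n".join(f"- {rule.action} [{rule.source}]" for rule in selected)
    return f"Follow these instructions:\n{actions}\n\nInput:\n{task_input}"


def toy_judge(rule: Rule, task_input: str) -> bool:
    """Stand-in for an LLM call: treats the condition as a literal substring trigger."""
    return rule.condition.lower() in task_input.lower()


# Illustrative rules only; the source labels are placeholders.
RULES = [
    Rule("lambda", "Use a def statement instead of assigning a lambda to a name.", "style-guide#1"),
    Rule("import *", "Avoid wildcard imports.", "style-guide#2"),
]

task_input = "square = lambda x: x * x"
selected = select_applicable(RULES, task_input, toy_judge)
prompt = build_prompt(task_input, selected)
```

Only the action of the triggered rule reaches the prompt, which is the mechanism by which this kind of selection can shrink the retrieved context compared with passing whole similar passages. Swapping `toy_judge` for a real model call would leave the other functions unchanged, since they depend only on the `Judge` signature.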