May 30, 2026 | arXiv:2606.00590

Critic-R: Improving Agentic Search using Instruction-tuned Retrievers with Natural Language Introspective Feedback

Md Takrim Ul Alam, Alireza Salemi, Hamed Zamani

AI Summary

This paper introduces Critic-R, a framework for agentic search in which a critic model evaluates the agent's reasoning trace against the retrieved context. Two mechanisms close the feedback loop between the reasoning agent and the retrieval model: an inference-time query refinement loop (Critic-R-Zero) and an optimization approach for retrieval models (Critic-Embed). Evaluations on four QA datasets (HotpotQA, 2WikiMultihopQA, MuSiQue, and Bamboogle) show that Critic-R significantly improves both retrieval quality and answer accuracy, without requiring manual relevance annotations to optimize the retriever.

Key Contribution

Critic-R lets retrieval models learn from the agent's successful and failed query-refinement trajectories, improving retrieval quality and answer accuracy without gold-standard annotations.

Abstract

Agentic search systems iteratively interact with retrieval models to answer complex queries. Despite substantial progress, optimizing retrievers for agentic search remains challenging, often requiring heavy co-training or gold-standard annotations that limit real-world applicability. We propose Critic-R, a framework that explicitly closes the feedback loop between the reasoning agent and the retrieval model during both inference and training. Critic-R introduces a critic model that evaluates the agent's introspective reasoning trace after consuming retrieved evidence to determine whether the retrieved context sufficiently supports the next reasoning step. Critic-R has two complementary mechanisms: Critic-R-Zero, an inference-time query refinement loop that iteratively rewrites queries and retrieval instructions, and Critic-Embed, an optimization approach for retrieval models that leverages successful and failed refinement trajectories as automatic supervision without requiring manual relevance annotation. We evaluate Critic-R on HotpotQA, 2WikiMultihopQA, MuSiQue, and Bamboogle. Results show that Critic-R significantly improves both retrieval quality and downstream answer accuracy.
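
The two mechanisms described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`refine_loop`, `to_training_triples`), the `Step` record, the toy corpus, and the rule-based retriever, reasoner, critic, and rewriter are all hypothetical stand-ins. In particular, the way triples are built here (original question as anchor, passages from a successful round as positives, passages from failed rounds as negatives) is an assumption; the abstract says only that successful and failed refinement trajectories serve as automatic supervision.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Step:
    query: str
    instruction: str
    docs: List[str]
    sufficient: bool  # critic's verdict for this round

def refine_loop(question: str, retrieve: Callable, reason: Callable,
                critic: Callable, rewrite: Callable,
                max_rounds: int = 3) -> List[Step]:
    """Inference-time loop: retrieve, reason, let the critic judge, rewrite."""
    query, instruction = question, ""
    trajectory: List[Step] = []
    for _ in range(max_rounds):
        docs = retrieve(query, instruction)
        trace = reason(question, docs)   # agent's introspective trace
        ok = critic(trace, docs)         # does the context support the next step?
        trajectory.append(Step(query, instruction, docs, ok))
        if ok:
            break
        query, instruction = rewrite(question, query, instruction, trace)
    return trajectory

def to_training_triples(trajectory: List[Step]) -> List[Tuple[str, str, str]]:
    """Assumed supervision scheme: (anchor, positive, negative) triples."""
    positives = [d for s in trajectory if s.sufficient for d in s.docs]
    negatives = [d for s in trajectory if not s.sufficient
                 for d in s.docs if d not in positives]
    if not trajectory or not positives:
        return []
    anchor = trajectory[0].query  # the original question (assumption)
    return [(anchor, p, n) for p in positives for n in negatives]

# --- Toy stand-ins (hypothetical; real systems would call models here) ---
CORPUS = {
    "doc_a": "Paris is the capital of France.",
    "doc_b": "The Eiffel Tower was completed in 1889.",
}
QUESTION = "Which year did the landmark in the capital of France open?"

def _tokens(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))

def toy_retrieve(query: str, instruction: str, k: int = 1) -> List[str]:
    # Rank passages by word overlap with the query plus the instruction.
    words = _tokens(query + " " + instruction)
    ranked = sorted(CORPUS.values(),
                    key=lambda t: len(words & _tokens(t)), reverse=True)
    return ranked[:k]

def toy_reason(question: str, docs: List[str]) -> str:
    if any(ch.isdigit() for d in docs for ch in d):
        return "The evidence states a year, so I can answer."
    return "I still need the completion year."

def toy_critic(trace: str, docs: List[str]) -> bool:
    return "still need" not in trace.lower()

def toy_rewrite(question, query, instruction, trace):
    # A real rewriter would condition on the critic's feedback.
    return "Eiffel Tower completed", "prefer passages that state a year"

if __name__ == "__main__":
    traj = refine_loop(QUESTION, toy_retrieve, toy_reason, toy_critic, toy_rewrite)
    for step in traj:
        print(step.sufficient, "|", step.query, "|", step.docs)
    print(to_training_triples(traj))
```

In this toy run the first round retrieves the passage about Paris, the critic rejects it, and the rewritten query and instruction retrieve the passage that states a year. The resulting triple could then feed a contrastive objective for the retriever, though the loss and pairing used in the paper may differ.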

Recommendation & Information Retrieval | RLHF & Preference Learning | Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 28

Year: 2026

Venue: N/A
