MBZUAI; National University of Science and Technology; Russian Academy of Sciences
Apr 8, 2026
arXiv:2604.07036

ReDAct: Uncertainty-Aware Deferral for LLM Agents

Dzianis Piatrashyn, Nikita Kotelevskii, Kirill Grishchenkov, Nikita Glazkov, Ivan Nasonov, Timothy Baldwin, Preslav Nakov, Roman Vashurin, Maxim Panov

AI Summary

The paper introduces ReDAct, a framework in which an LLM agent defers a decision from a small, cheap LLM to a larger, more reliable LLM whenever the small model's predictive uncertainty exceeds a calibrated threshold. Calibrating this threshold limits use of the expensive model while maintaining decision quality in sequential tasks. Experiments in ALFWorld and MiniGrid show that deferring only about 15% of decisions can match the performance of using the large model exclusively, while significantly reducing inference costs.

Key Contribution

Deferring to a larger LLM only when a smaller LLM is uncertain can match the performance of the larger model alone, while slashing inference costs.
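
The cost side of this claim can be made concrete with a small calculation. The sketch below is illustrative only and rests on assumptions not stated on this page: the small model runs on every decision, the large model runs only on deferred ones, both process similar token counts, and the prices are hypothetical placeholders rather than figures from the paper.

```python
def relative_cost(defer_rate: float, small_price: float, large_price: float) -> float:
    """Expected per-decision cost of deferral, relative to always using the large model.

    Assumption (not from the paper): the small model is always queried, and the
    large model is queried additionally on the deferred fraction of decisions.
    """
    if not 0.0 <= defer_rate <= 1.0:
        raise ValueError("defer_rate must be in [0, 1]")
    if small_price < 0 or large_price <= 0:
        raise ValueError("prices must be non-negative, and large_price positive")
    expected = small_price + defer_rate * large_price
    return expected / large_price


# Hypothetical prices: the large model costs 20x the small one per decision.
ratio = relative_cost(defer_rate=0.15, small_price=1.0, large_price=20.0)
print(f"cost relative to large-only: {ratio:.2f}")
```

Under these hypothetical prices, a 15% deferral rate costs about one fifth of running the large model on every step. The actual saving depends on the real price gap and on token counts per call.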

Abstract

Recently, LLM-based agents have become increasingly popular across many applications, including complex sequential decision-making problems. However, they inherit the tendency of LLMs to hallucinate, leading to incorrect decisions. In sequential settings, even a single mistake can irreversibly degrade the trajectory, making hallucinations an even bigger problem. Although larger LLMs hallucinate less, they incur a significantly higher per-token cost. In this paper, we address this tradeoff by proposing ReDAct (Reason-Defer-Act). In ReDAct, an agent is equipped with two LLMs: a small, cheap model used by default, and a large, more reliable but expensive model. When the predictive uncertainty of the small model exceeds a calibrated threshold, the decision is deferred to the large model. We evaluate our approach in text-based embodied environments such as ALFWorld and MiniGrid and show that deferring only about 15% of decisions to the large model can match the quality of using it exclusively, while significantly reducing inference costs.
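
The deferral rule described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which uncertainty measure or calibration procedure ReDAct uses, so the mean token negative log-likelihood and the quantile-style calibration below are assumptions, and `small_model` and `large_model` are hypothetical callables standing in for real LLM calls.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical interfaces: the small model returns an action plus per-token
# log-probabilities; the large model returns only an action.
SmallModel = Callable[[str], Tuple[str, Sequence[float]]]
LargeModel = Callable[[str], str]


def mean_token_nll(token_logprobs: Sequence[float]) -> float:
    """Assumed uncertainty proxy: average negative log-probability per token."""
    if not token_logprobs:
        raise ValueError("token_logprobs must be non-empty")
    return -sum(token_logprobs) / len(token_logprobs)


def calibrate_threshold(uncertainties: Sequence[float], defer_fraction: float) -> float:
    """Choose a threshold that roughly `defer_fraction` of calibration scores exceed."""
    if not uncertainties:
        raise ValueError("need at least one calibration score")
    if not 0.0 <= defer_fraction <= 1.0:
        raise ValueError("defer_fraction must be in [0, 1]")
    ranked = sorted(uncertainties)
    n_defer = round(defer_fraction * len(ranked))
    k = len(ranked) - n_defer - 1
    # k < 0 means every calibration score should be deferred.
    return ranked[k] if k >= 0 else float("-inf")


def choose_action(observation: str, small_model: SmallModel,
                  large_model: LargeModel, threshold: float) -> Tuple[str, bool]:
    """Return (action, deferred): use the small model unless it is too uncertain."""
    action, logprobs = small_model(observation)
    if mean_token_nll(logprobs) > threshold:
        return large_model(observation), True
    return action, False


# Usage with stub models (placeholders, not real LLM calls).
calibration_scores = [float(i) for i in range(1, 11)]
tau = calibrate_threshold(calibration_scores, defer_fraction=0.2)
confident = lambda obs: ("open door", [-0.1, -0.2])
unsure = lambda obs: ("open door", [-9.0, -11.0])
expert = lambda obs: "pick up key"
print(choose_action("You see a locked door.", confident, expert, tau))
print(choose_action("You see a locked door.", unsure, expert, tau))
```

The threshold is fixed once on held-out calibration scores, so the per-step decision reduces to a single comparison; the deferral rate observed at deployment may differ from the calibration target if the distribution of uncertainties shifts.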

Eval Frameworks & Benchmarks; Reasoning & Chain-of-Thought; Tool Use & Agents

Year: 2026

Venue: N/A
