Arizona | ASU | Jun 18, 2026 | arXiv:2606.20529

LedgerAgent: Structured State for Policy-Adherent Tool-Calling Agents

Md Nayem Uddin, Amir Saeidi, Eduardo Blanco, Chitta Baral

AI Summary

This paper introduces LedgerAgent, an inference-time method for managing task state in policy-adherent tool-calling agents. Standard agents manage state implicitly, reconstructing it from the prompt each time they decide what to do next. LedgerAgent instead maintains observed task states in a separate ledger, renders them into the prompt, and checks state-dependent policy constraints before environment-changing tool calls are executed. Across four customer-service domains, it improves average pass^k over a standard prompt-based approach, with the largest gains under stricter multi-trial consistency metrics.

Key Contribution

LedgerAgent keeps observed task states in a separate ledger and uses that ledger to block policy-violating tool calls, improving average pass^k over a standard prompt-based tool-calling approach.

Abstract

Policy-adherent tool-calling agents in customer-service domains must maintain task states across turns while calling tools and obeying domain policies. Task states consist of relevant facts, identifiers, constraints, and conditions observed through user interaction and tool calls. In standard agents, task states are not represented separately. Observations, tool returns, and policy instructions are placed in the prompt, leaving agents to reconstruct the relevant states from the prompt each time they decide what to do next. This design makes state management implicit, creating two common failure modes. An agent may retrieve the right facts but later ground its decision in stale, missing, or incorrect information; and a syntactically valid tool call may still violate a domain policy that depends on the current task state. We introduce LedgerAgent, an inference-time method for tool-calling agents that maintains observed task states in a separate ledger and renders the states into the prompt. The ledger is also used to check state-dependent policy constraints before environment-changing tool calls are executed, blocking policy violations. Across four customer-service domains and a mixed panel of open- and closed-weight models, LedgerAgent improves average pass^k over a standard prompt-based tool-calling approach, with the largest gains under stricter multi-trial consistency metrics.
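The two roles the abstract assigns to the ledger (rendering observed state into the prompt, and gating environment-changing tool calls) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Ledger` class, the `guarded_call` helper, the `cancel_reservation` tool, the `reservation:<id>:status` key format, and the "refundable" policy rule are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Ledger:
    """Observed task state kept outside the prompt (hypothetical structure)."""
    facts: dict = field(default_factory=dict)

    def record(self, key: str, value: Any) -> None:
        # A newer observation overwrites the older one, so stale values do not linger.
        self.facts[key] = value

    def render(self) -> str:
        # Text block that would be placed into the prompt on each turn.
        lines = [f"- {key}: {value}" for key, value in sorted(self.facts.items())]
        return "CURRENT TASK STATE\n" + "\n".join(lines)


# A constraint returns None when satisfied, or a reason string when violated.
Constraint = Callable[[Ledger, dict], Optional[str]]


def cancel_requires_refundable(ledger: Ledger, args: dict) -> Optional[str]:
    # Hypothetical policy: only reservations observed as "refundable" may be cancelled.
    status = ledger.facts.get(f"reservation:{args['reservation_id']}:status")
    if status is None:
        return "reservation status has not been observed yet"
    if status != "refundable":
        return f"policy forbids cancelling a reservation with status {status!r}"
    return None


# Constraints are registered per environment-changing tool (assumed registry).
CONSTRAINTS = {"cancel_reservation": [cancel_requires_refundable]}


def guarded_call(ledger: Ledger, tool_name: str, args: dict,
                 execute: Callable[[str, dict], Any]) -> dict:
    """Check state-dependent constraints against the ledger before executing."""
    for check in CONSTRAINTS.get(tool_name, []):
        reason = check(ledger, args)
        if reason is not None:
            # The call is blocked and never reaches the environment.
            return {"blocked": True, "reason": reason}
    return {"blocked": False, "result": execute(tool_name, args)}
```

In this sketch, a call to `cancel_reservation` is blocked until the ledger holds an observed status for that reservation, which mirrors the failure mode where a syntactically valid call violates a policy that depends on the current task state. How the paper actually structures the ledger or encodes policies is not specified in the abstract.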

RLHF & Preference Learning | Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
