This paper introduces Cordon, a transactional runtime for tool-using LLM agents built around a semantic transaction model: a task-level execution boundary that spans multiple tool calls. Current agent runtimes expose tools as isolated RPCs and therefore have no such boundary. Cordon adds one, managing irreversible effects through staged validation, rollback, and audit trails. In the authors' evaluation across adversarial and benign workflows, Cordon reduces irreversible-effect failures while preserving benign task completion, with modest approval and latency overhead.
Cordon's evaluation suggests that a transactional agent runtime can reduce irreversible-effect failures across multi-step workflows while preserving benign task completion.
Tool-using LLM agents are shifting the unit of computation from explicit human-issued commands to model-driven tasks with stateful consequences. Yet today's agent runtimes still expose tools as isolated RPCs. This interface gives runtimes a convenient integration point, but it lacks a task-scoped execution boundary for commit, rollback, recovery, and audit across multi-step agent workflows. We argue that this mismatch calls for a runtime containment boundary rather than another per-call guardrail. This paper introduces Cordon, a transactional runtime system for staging and validating irreversible agent effects before commit. A semantic transaction is a task-level execution boundary that binds tool intents and runtime-tracked result lineage to reversible local state, staged external effects, delegated authority, and audit metadata. Cordon implements this abstraction with a transaction manager that tracks derived result objects, executes reversible mutations in shadow state, stages outward-facing actions in an effect outbox, and records recovery metadata. The runtime then validates the composed execution flow before it commits state or releases external effects. Our evaluation across adversarial and benign workflows shows that Cordon exposes cross-step violations missed by existing defenses. It also reduces irreversible-effect failures while preserving benign task completion with modest approval and latency overhead.
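To make the abstraction more concrete, the following is a minimal Python sketch of the pattern the abstract describes: reversible mutations run against shadow state, outward-facing actions wait in an effect outbox, and the composed flow is validated before anything is committed or released. It is not Cordon's implementation. Every name here (`SemanticTransaction`, `StagedEffect`, `no_untrusted_send`, the `untrusted:` lineage prefix) is a hypothetical stand-in chosen for illustration, and the lineage check is one assumed example of a cross-step policy.

```python
import copy
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class StagedEffect:
    """An outward-facing action held back until commit (hypothetical shape)."""
    tool: str
    args: dict[str, Any]
    # Lineage: ids of earlier results this effect was derived from (assumed format).
    derived_from: tuple[str, ...] = ()


# A validator inspects the whole transaction and returns violation messages.
Validator = Callable[["SemanticTransaction"], list[str]]


class SemanticTransaction:
    """Task-level boundary: shadow state, effect outbox, and an audit log."""

    def __init__(self, state: dict[str, Any], validators: list[Validator]):
        self._committed = state              # live state, untouched until commit
        self.shadow = copy.deepcopy(state)   # reversible mutations land here
        self.outbox: list[StagedEffect] = [] # irreversible effects wait here
        self.audit: list[str] = []
        self._validators = validators

    def mutate(self, key: str, value: Any) -> None:
        self.shadow[key] = value
        self.audit.append(f"mutate {key}")

    def stage(self, effect: StagedEffect) -> None:
        self.outbox.append(effect)
        self.audit.append(f"stage {effect.tool}")

    def rollback(self) -> None:
        # Discard shadow changes and drop every staged effect.
        self.shadow = copy.deepcopy(self._committed)
        self.outbox.clear()
        self.audit.append("rollback")

    def commit(self, release: Callable[[StagedEffect], None]) -> list[str]:
        # Validate the composed flow, not each call in isolation.
        violations = [msg for check in self._validators for msg in check(self)]
        if violations:
            self.rollback()
            self.audit.append(f"rejected: {len(violations)} violation(s)")
            return violations
        # Publish shadow state in place, then release the staged effects in order.
        self._committed.clear()
        self._committed.update(copy.deepcopy(self.shadow))
        for effect in self.outbox:
            release(effect)
        self.audit.append("commit")
        return []


def no_untrusted_send(txn: SemanticTransaction) -> list[str]:
    """Example cross-step policy: block effects derived from untrusted results."""
    return [
        f"{effect.tool} depends on untrusted result {source}"
        for effect in txn.outbox
        for source in effect.derived_from
        if source.startswith("untrusted:")
    ]


if __name__ == "__main__":
    state = {"notes": []}
    sent: list[StagedEffect] = []
    txn = SemanticTransaction(state, [no_untrusted_send])
    txn.mutate("notes", ["draft"])
    txn.stage(StagedEffect("send_email", {"to": "a@example.com"},
                           ("untrusted:web_page",)))
    print(txn.commit(sent.append))  # violations, if any
    print(state, sent)              # live state and released effects
    print(txn.audit)
```

In this sketch a rejected transaction leaves the live state unchanged and releases nothing, because the email was only ever staged. Passing validators that see the whole transaction, rather than one call at a time, is the design choice that lets a policy reason about lineage across steps.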