This paper introduces MAGE (Memory as Agent-Guided Exploration), a memory management system for long-horizon agents that organizes memory as a hierarchical state tree rather than retrieving entries by semantic similarity, as existing retrieval-augmented generation (RAG) and agent memory systems do. By actively managing execution states, MAGE lets agents maintain coherent decision trajectories and isolate errors, which is crucial for tasks with interdependent decisions. On MemoryArena, MAGE improves the average task success rate by 7.8–20.4 percentage points over baselines while reducing token consumption by 55.1%.
MAGE manages memory for long-horizon agents as a hierarchical state tree, raising task success rates by up to 20.4 percentage points while cutting token consumption by 55.1%.
LLM-based agents increasingly tackle long-horizon tasks with interdependent decisions, where each action reshapes future constraints and intermediate errors can cascade. Existing RAG and agent memory systems organize histories by semantic similarity, retrieving content-relevant entries at decision time. We argue that this design mismatches execution-state dependencies: it fragments decision trajectories and mixes valid and erroneous traces, hindering coherent state reconstruction and error isolation. We propose MAGE (Memory as Agent-Guided Exploration), an active execution-state manager that stores interactions in a hierarchical state tree. The agent derives its state from the active root-to-current path, combining subgoal summaries, recent traces, and hints from prior branches. Four coupled operations maintain the tree: Grow records new traces, Compress summarizes completed subgoals, Maintain validates summaries, and Revise restores a target boundary and resumes on a new branch. This design bounds context growth while preserving state integrity and isolating flawed segments from the active path. Experiments on MemoryArena show that MAGE improves the average task success rate by 7.8–20.4 pp over baselines, while reducing token consumption by 55.1%.
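The four operations described in the abstract can be pictured with a small Python sketch. It is not the authors' implementation. The class and method names (`Node`, `StateTree`, `active_path`, `state`), the `recent_k` window, and the `summarize` and `is_valid` callbacks (stand-ins for LLM calls) are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, List, Optional

Summarizer = Callable[[List[str]], str]        # hypothetical: traces -> summary
Validator = Callable[[str, List[str]], bool]   # hypothetical: (summary, traces) -> ok?


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    """One subgoal segment of the state tree (assumed layout)."""
    subgoal: str
    parent: Optional[Node] = None
    traces: List[str] = field(default_factory=list)    # raw interaction records
    summary: Optional[str] = None                       # set when the subgoal is compressed
    hints: List[str] = field(default_factory=list)      # lessons carried over from a prior branch
    children: List[Node] = field(default_factory=list)


class StateTree:
    def __init__(self, task: str, first_subgoal: str, recent_k: int = 2) -> None:
        self.root = Node(subgoal=task)  # empty anchor that only holds the task text
        self.recent_k = recent_k
        self.current = self._branch(self.root, first_subgoal)

    def _branch(self, parent: Node, subgoal: str) -> Node:
        child = Node(subgoal=subgoal, parent=parent)
        parent.children.append(child)
        self.current = child
        return child

    def active_path(self) -> List[Node]:
        """Nodes from the root down to the current node."""
        path, node = [], self.current
        while node is not None:
            path.append(node)
            node = node.parent
        return path[::-1]

    def grow(self, trace: str) -> None:
        """Grow: record a new trace under the current subgoal."""
        self.current.traces.append(trace)

    def compress(self, summarize: Summarizer, next_subgoal: str) -> Node:
        """Compress: summarize the finished subgoal, then open the next one."""
        self.current.summary = summarize(self.current.traces)
        return self._branch(self.current, next_subgoal)

    def maintain(self, is_valid: Validator, summarize: Summarizer) -> int:
        """Maintain: re-check summaries on the active path; rebuild failing ones."""
        repaired = 0
        for node in self.active_path():
            if node.summary is not None and not is_valid(node.summary, node.traces):
                node.summary = summarize(node.traces)
                repaired += 1
        return repaired

    def revise(self, target: Node, subgoal: str, hint: str) -> Node:
        """Revise: return to `target` and resume on a fresh branch with a hint."""
        if not any(node is target for node in self.active_path()):
            raise ValueError("target must lie on the active path")
        branch = self._branch(target, subgoal)
        branch.hints.append(hint)
        return branch

    def state(self) -> List[str]:
        """Agent state: summaries, hints, and recent traces along the active path."""
        lines = [f"task: {self.root.subgoal}"]
        for node in self.active_path()[1:]:
            lines += [f"hint: {h}" for h in node.hints]
            if node.summary is not None:
                lines.append(f"summary[{node.subgoal}]: {node.summary}")
            else:
                lines += node.traces[-self.recent_k:]
        return lines


# Usage: a wrong hotel booking is isolated on an abandoned branch.
join = lambda traces: " | ".join(traces)  # stand-in for an LLM summarizer
tree = StateTree("plan a trip", "book flight")
tree.grow("searched flights")
tree.grow("booked flight arriving Tue")
tree.compress(join, "book hotel")
boundary = tree.current.parent            # end of the completed flight subgoal
tree.grow("booked hotel from Mon")        # flawed step
tree.revise(boundary, "book hotel", "flight arrives Tue; book from Tue")
tree.grow("booked hotel from Tue")
print("\n".join(tree.state()))
```

In this sketch the flawed trace stays in the tree under the abandoned branch but never appears in `state()`, because the state is read only from the root-to-current path. Context stays bounded because completed subgoals contribute a single summary line and the current subgoal contributes at most `recent_k` traces.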