This paper introduces ProactAgent, a lifelong learning framework that proactively retrieves information from a structured experience base to improve agent performance. The framework uses Experience-Enhanced Online Evolution (ExpOnEvo) to continually improve through policy updates and memory refinement, organizing historical interactions into typed repositories. Proactive Reinforcement Learning-based Retrieval (ProactRL) is then used to learn when and what to retrieve by modeling retrieval as an explicit policy action, leading to improved performance and reduced retrieval overhead in SciWorld, AlfWorld, and StuLife environments.
Instead of waiting passively for retrieval cues, ProactAgent proactively queries its memory and skills, leading to significant gains in lifelong learning performance.
Online lifelong learning enables agents to accumulate experience across interactions and continually improve on long-horizon tasks. However, existing methods typically treat retrieval from past experience as a passive operation, triggering it only at task initialization or after completing a step. Consequently, agents often fail to identify knowledge gaps during interaction and proactively retrieve the most useful experience for the current decision. To address this limitation, we present ProactAgent, an experience-driven lifelong learning framework for proactive retrieval over a structured experience base. We first introduce Experience-Enhanced Online Evolution (ExpOnEvo), which enables continual improvement through both policy updates and memory refinement. The experience base organizes historical interactions into typed repositories, including factual memory, episodic memory, and behavioral skills, so that retrieval can provide both relevant evidence and actionable guidance. On top of this, we propose Proactive Reinforcement Learning-based Retrieval (ProactRL), which models retrieval as an explicit policy action and learns when and what to retrieve via paired-branch process rewards. By comparing continuations from identical interaction prefixes with and without retrieval, ProactRL provides step-level supervision for retrieval decisions, encouraging retrieval only when it leads to better task outcomes or higher efficiency. Experiments on SciWorld, AlfWorld, and StuLife show that ProactAgent consistently improves lifelong agent performance, achieving success rates of 73.50% on SciWorld and 71.28% on AlfWorld while substantially reducing retrieval overhead, and attains performance competitive with proprietary models on StuLife.
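The paired-branch idea in the abstract can be illustrated with a small sketch. The abstract does not give the reward formula, so everything below is an assumption made for illustration: the `BranchOutcome` type, the `paired_branch_reward` function, the `efficiency_weight` and `retrieval_cost` parameters, and the specific way outcome and efficiency are combined are all hypothetical, not the authors' implementation. The sketch only shows the general shape: two continuations share the same interaction prefix, one retrieves and one does not, and the retrieval action is rewarded only when its branch ends better or faster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BranchOutcome:
    """Result of rolling out one continuation from a shared prefix (hypothetical type)."""
    success: bool  # whether the task was completed
    steps: int     # number of environment steps used


def paired_branch_reward(
    with_retrieval: BranchOutcome,
    without_retrieval: BranchOutcome,
    efficiency_weight: float = 0.1,  # assumed weight on step savings
    retrieval_cost: float = 0.05,    # assumed fixed cost per retrieval call
) -> float:
    """Step-level reward for a retrieval action, assuming a simple additive form.

    Positive when the retrieval branch succeeds where the other fails, or
    finishes in fewer steps by enough to offset the retrieval cost.
    """
    # Outcome term: +1, 0, or -1 depending on which branch succeeded.
    outcome_gain = float(with_retrieval.success) - float(without_retrieval.success)

    # Efficiency term: fraction of steps saved by retrieving, in [-1, 1].
    longest = max(with_retrieval.steps, without_retrieval.steps, 1)
    efficiency_gain = (without_retrieval.steps - with_retrieval.steps) / longest

    return outcome_gain + efficiency_weight * efficiency_gain - retrieval_cost


# Retrieval turns a failure into a success: the retrieval action is rewarded.
helpful = paired_branch_reward(BranchOutcome(True, 12), BranchOutcome(False, 30))

# Both branches end identically: retrieval only adds overhead, so it is penalized.
useless = paired_branch_reward(BranchOutcome(True, 15), BranchOutcome(True, 15))
```

Under these assumptions, the fixed cost term is what discourages retrieval when it changes nothing, which matches the abstract's stated goal of retrieving only when it improves outcomes or efficiency.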