This paper investigates how to combine supervised fine-tuning (SFT) and retrieval-augmented generation to improve LLM agent generalization to unseen tasks. The authors establish a strong SFT baseline using LoRA and analyze key design choices for experience retrieval, including storage, querying, and trajectory selection. By integrating experience retrieval into the fine-tuning process, their pipeline significantly boosts generalization on unseen tasks compared to SFT or retrieval alone.
Retrieval-augmented LLM agents can learn to learn from experience, achieving significantly better generalization on unseen tasks by combining the strengths of fine-tuning and in-context retrieval.
While large language models (LLMs) have advanced the development of general-purpose agents, achieving robust generalization to unseen tasks remains a significant challenge. Current approaches typically rely on either fine-tuning or training-free memory-augmented generation using retrieved experience, yet both have limitations: fine-tuning often fails to extrapolate to new tasks, while experience retrieval often underperforms compared to supervised baselines. In this work, we propose to combine these approaches and systematically study how to train retrieval-augmented LLM agents to effectively leverage retrieved trajectories in-context. First, we establish a robust supervised fine-tuning (SFT) recipe using LoRA that outperforms several state-of-the-art agent training pipelines. Second, we provide a detailed analysis of key design choices for experience retrieval, identifying optimal strategies for storage, querying, and trajectory selection. Finally, we propose a pipeline that integrates experience retrieval into the fine-tuning process. Our results demonstrate that this combined approach significantly improves generalization to unseen tasks, providing a scalable and effective framework for building agents that learn to learn from experience.
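To make the storage/querying/selection loop concrete, here is a minimal sketch of experience retrieval as described in the abstract: successful trajectories are stored with an embedding of their task description, and at both fine-tuning and inference time the top-k most similar trajectories are retrieved and placed in-context before the new task. All names (`ExperienceStore`, `build_prompt`) are hypothetical, and the toy bag-of-words embedding stands in for the learned text encoder a real pipeline would use; the paper's actual storage, query, and selection strategies differ.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real system would use a learned encoder.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ExperienceStore:
    """Stores past agent trajectories keyed by an embedding of their task."""
    def __init__(self):
        self.entries = []  # list of (task_embedding, trajectory_text)

    def add(self, task, trajectory):
        self.entries.append((embed(task), trajectory))

    def retrieve(self, task, k=2):
        # Query: embed the new task, rank stored trajectories by similarity.
        query = embed(task)
        ranked = sorted(self.entries, key=lambda e: cosine(query, e[0]), reverse=True)
        return [traj for _, traj in ranked[:k]]

def build_prompt(task, store, k=2):
    # Retrieved trajectories are prepended in-context; the agent is then
    # fine-tuned (and later run) on prompts of this shape.
    demos = store.retrieve(task, k)
    return "\n\n".join(demos + [f"Task: {task}"])

store = ExperienceStore()
store.add("sort a list of numbers", "Task: sort a list of numbers\nAction: ...")
store.add("find the max element", "Task: find the max element\nAction: ...")
print(build_prompt("sort numbers in ascending order", store, k=1))
```

The key point of the combined pipeline is that fine-tuning is done on these retrieval-augmented prompts, so the model learns *how to use* retrieved trajectories rather than merely seeing them at test time.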