Tsinghua AI | Guangzhou City Polytechnic | Jilin | Jun 1, 2026 | arXiv:2606.01528

Joint Agent Memory and Exploration Learning via Novelty Signals

Shizuo Tian, Xiaohong Weng, Rui Kong, Yuxuan Chen, Guohong Liu, Yuebing Song, Jiacheng Liu, Yuchen Li, Ting Cao, Yunxin Liu, Yuanchun Li

AI Summary

This paper introduces Joint Agent Memory and Exploration Learning (JAMEL), a framework that jointly trains an agent's memory and exploration policy through novelty-driven interaction. Deterministic novelty signals, such as code coverage in the GUI domain, supply annotation-free supervision for the memory module, helping agents distinguish exhausted behaviors from unseen ones. In the reported evaluations, JAMEL generalizes to unseen environments, outperforms open-weight baselines in exploration capability, and rivals the exploration depth of a closed-source model while reducing token consumption.

Key Contribution

Training memory and exploration jointly through novelty-driven interaction lets agents explore more effectively while using memory efficiently, outperforming open-weight baselines in open-ended environments.

Abstract

In open-ended environments, exploration is fundamental for autonomous agents, yet current language model agents struggle with it. Effective exploration requires memory, but retaining raw interaction histories is computationally expensive over long trajectories. While latent memory offers a way to compress interaction histories, its training lacks reliable supervisory signals. We introduce Joint Agent Memory and Exploration Learning (JAMEL), a framework that trains agentic memory and exploration policy together through novelty-driven interaction. We observe that memory and exploration form a mutually dependent loop: sustained exploration requires memory to distinguish exhausted behaviors from unseen ones, while novelty-seeking interaction provides the supervision needed to make memory useful for future exploration. By utilizing deterministic and persistent novelty signals such as code coverage in the GUI domain, we provide natural, annotation-free supervision for the memory module. Empirical evaluations demonstrate that JAMEL successfully generalizes to unseen environments. Its exploration capability outperforms open-weight baselines and rivals the exploration depth of a closed-source model while reducing token consumption. Our code and model are open-sourced at https://github.com/MobileLLM/JAMEL.
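
To make the idea of a deterministic, persistent novelty signal concrete, here is a minimal sketch of a coverage-based novelty reward. It is an illustration under stated assumptions, not the authors' implementation: the class name `CoverageNovelty`, the `reward` method, and the method-name strings used as coverage units are all hypothetical. The reward for a step is simply the number of code units covered for the first time, so repeating an exhausted behavior yields zero.

```python
from typing import Iterable, List, Set


class CoverageNovelty:
    """Hypothetical tracker: remembers every code unit covered so far."""

    def __init__(self) -> None:
        self.seen: Set[str] = set()

    def reward(self, covered_now: Iterable[str]) -> int:
        """Return how many units are covered for the first time, then record them."""
        new_units = set(covered_now) - self.seen
        self.seen |= new_units
        return len(new_units)


# Assumed example data: coverage units reported after each agent action.
tracker = CoverageNovelty()
steps: List[Set[str]] = [
    {"MainActivity.onCreate", "MainActivity.onResume"},
    {"MainActivity.onResume", "SettingsActivity.onCreate"},
    {"MainActivity.onCreate"},  # revisits only already-covered code
]
rewards = [tracker.reward(step) for step in steps]
print(rewards)
```

Because the signal depends only on the set of units already covered, it is deterministic and persists across the whole trajectory, which is the property the abstract relies on for supervising memory without annotations.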

Tool Use & Agents | World Models & Planning

Citation Metrics

Citations: 0
Influential citations: 0
References: 0
Year: 2026
Venue: N/A
