AI Laboratory, NTU, UW-Madison, Virginia Tech, Waterloo, ZJU

Jun 4, 2026

arXiv:2606.06054

Beyond Similarity: Trustworthy Memory Search for Personal AI Agents

Jiawen Zhang, Kejia Chen, Jiachen Ma, Yangfan Hu, Lipeng He, Yechao Zhang, Jian Liu, Xiaohu Yang, Tianwei Zhang, Ruoxi Jia

AI Summary

This paper examines a limitation of memory retrieval in personal AI agents: retrieval driven by semantic similarity can surface memories that are related to the query but contextually inappropriate, enabling threats such as cross-domain leakage, sycophancy, tool-call drift, and memory-induced jailbreaks. Evaluating representative memory frameworks (A-Mem, Mem0, and MemOS) and the OpenClaw personal-agent environment, the authors show that long-term memory acts as a durable control channel rather than just a utility layer. They introduce MemGate, a lightweight memory plug-in (9M parameters, 35.1MB) that applies a query-conditioned neural gate to candidate memory representations, reducing these threats while preserving long-term memory utility across memory frameworks and LLM backbones.

Key Contribution

Treating memory search as a trust boundary and gating retrieved memories on the current task reduces memory-induced threats in personal AI agents while preserving long-term memory utility.

Abstract

Personal AI agents increasingly rely on long-term memory to provide persistent personalization across sessions. However, existing memory pipelines are largely driven by semantic similarity: memory data close to the current query is retrieved and injected into the model context. This creates a critical trustworthiness gap, since a semantically related memory may still be contextually inappropriate, leading to threats such as cross-domain leakage, sycophancy, tool-call drift, or memory-induced jailbreaks. In this paper, we study memory search as a trust boundary in personal AI agents. We evaluate representative agentic memory frameworks, including A-Mem, Mem0, and MemOS, together with OpenClaw, a real-world personal-agent environment with persistent state and tool-use capability. Our results show that long-term memory is not merely a utility layer, but a durable control channel that can reshape how agents interpret tasks and execute actions, leaving them highly susceptible to the aforementioned threats. To mitigate these vulnerabilities, we propose MemGate, a lightweight and deployable memory plug-in for trustworthy memory search, with only 9M parameters and a 35.1MB footprint. MemGate is inserted between the vector memory store and the backbone LLM, requiring no LLM modification, memory-database rewriting, or inference-time LLM judge. It applies a query-conditioned neural gate to candidate memory representations, turning raw similarity search into task-conditioned memory admission. Across multiple mainstream memory frameworks, real-world agent settings, and diverse LLM backbones, MemGate reduces memory-induced threats while preserving long-term memory utility.
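The difference between raw similarity search and task-conditioned admission can be illustrated with a small sketch. This is not the authors' MemGate architecture, which the abstract does not specify beyond "a query-conditioned neural gate"; it is a toy logistic gate over hand-made 2-dimensional embeddings. The names `Memory`, `gate_score`, and `admit`, the feature layout, the weights, and the 0.5 threshold are all assumptions made for illustration, with hand-set weights standing in for trained parameters.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Memory:
    text: str
    embedding: List[float]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k_by_similarity(query: List[float], store: List[Memory], k: int) -> List[Memory]:
    """Baseline retrieval: the k memories closest to the query."""
    return sorted(store, key=lambda m: cosine(query, m.embedding), reverse=True)[:k]


def gate_score(query: List[float], mem: List[float],
               weights: List[float], bias: float) -> float:
    """Hypothetical gate: logistic regression over [query, memory, query * memory]."""
    features = list(query) + list(mem) + [q * m for q, m in zip(query, mem)]
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def admit(query: List[float], store: List[Memory], weights: List[float],
          bias: float, k: int = 3, threshold: float = 0.5) -> List[Memory]:
    """Similarity search proposes candidates; the gate decides which reach the LLM."""
    candidates = top_k_by_similarity(query, store, k)
    return [m for m in candidates
            if gate_score(query, m.embedding, weights, bias) >= threshold]


# Toy data: dimension 0 is topical overlap, dimension 1 a sensitive-domain signal.
store = [
    Memory("prefers vegetarian recipes", [0.9, 0.1]),
    Memory("recent medical diagnosis", [0.8, 0.6]),
    Memory("bank login note", [0.0, 1.0]),
]
query = [1.0, 0.0]
# Assumed weights: reward query-memory overlap, penalize the sensitive dimension.
weights = [0.0, 0.0, 0.0, -10.0, 4.0, 0.0]

retrieved = top_k_by_similarity(query, store, k=2)
admitted = admit(query, store, weights, bias=0.0, k=2)
print([m.text for m in retrieved])
print([m.text for m in admitted])
```

In this toy setup the similarity search returns both the recipe preference and the medical memory, because both are close to the query, while the gate admits only the first. A real gate would be trained rather than hand-weighted, and would operate on the embeddings produced by the memory framework's own encoder.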

Constitutional AI & AI Ethics, Recommendation & Information Retrieval, Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
