NUS, Fudan, HKU, KAUST, Tongji, UM, Xiaohongshu, ZJU, δ University of California

May 24, 2026

arXiv:2605.25002

MemMark: State-Evolution Attribution Watermarking for Agent Long-Term Memory Systems

Haobo Zhang, Xutao Mao, Guangyuan Dong, Ziwei Li, Xuanbo Su, Kaijie Chen, Jing Yang, Zheng Lin

AI Summary

The paper introduces MemMark, a novel watermarking scheme for memory-augmented agents that embeds attribution signals into latent memory-write decisions. MemMark uses keyed, distribution-preserving selection among admissible memory candidates and records cryptographic commitments to ensure attribution depends on reproducible backend behavior. Experiments on A-Mem and Graphiti with LoCoMo demonstrate that MemMark preserves memory utility (99.6% F1 retention) while enabling robust, snapshot-only attribution, even under various memory-lifecycle attacks.

Key Contribution

Agent memories can be watermarked with negligible utility loss and without relying on logs, enabling snapshot-only attribution even after memory migration or leakage.

Abstract

Memory-backed agents need provenance that can survive leaked or migrated snapshots, where logs, visible outputs, and trusted metadata may be absent. We propose MemMark, a state-evolution attribution watermark that embeds an owner-controlled signal into latent memory-write decisions. At each internal LLM call, MemMark samples among admissible candidates using keyed, distribution-preserving selection, and records cryptographic commitments with signed session anchors and reveal evidence. This makes attribution depend on reproducible backend behavior rather than mutable provenance fields. Across A-Mem and Graphiti on LoCoMo, with three LLM backbones, MemMark preserves memory utility: Overall F1 retains 99.6% of the unwatermarked baseline, while BLEU-1 changes by +0.2%. It also provides usable carrier capacity, with 1.16, 1.14, and 1.26 bits of mean entropy for update-target, link-target, and semantic-realization decisions, respectively. In the snapshot-only R3 setting, MemMark recovers the full 40-bit payload from final snapshots, while wrong-key verification remains near chance. Under nine memory-lifecycle attacks, verification distinguishes tampering, evidence deletion, and partial payload recovery. These results show that robust snapshot-only attribution is feasible for long-term agent memory without relying on surviving traces or trusted metadata, and without degrading utility.
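To make "keyed, distribution-preserving selection" more concrete, the following is a minimal sketch of one generic way such a step could work. It is not the authors' implementation. It assumes an HMAC-derived pseudorandom number drives ordinary inverse-CDF sampling over the candidate probabilities, and a plain SHA-256 hash stands in for the commitment. All function names, the context string format, and the example probabilities are hypothetical.

```python
import hashlib
import hmac
import math


def keyed_uniform(key: bytes, context: bytes) -> float:
    """Derive a pseudorandom number in [0, 1) from (key, context).

    Assumption: HMAC-SHA256 acts as the keyed pseudorandom function.
    The top 53 bits are used so the result is an exact float below 1.0.
    """
    digest = hmac.new(key, context, hashlib.sha256).digest()
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def keyed_select(key: bytes, context: bytes, candidates, probs):
    """Pick one candidate by inverse-CDF sampling with a keyed uniform.

    Without the key, the uniform looks random, so choices follow `probs`
    (the distribution is preserved). With the key, the same choice can be
    recomputed from the same context and probabilities.
    """
    u = keyed_uniform(key, context)
    cumulative = 0.0
    for candidate, p in zip(candidates, probs):
        cumulative += p
        if u < cumulative:
            return candidate
    return candidates[-1]  # guard against floating-point rounding in the sum


def commit(choice: str, nonce: bytes) -> str:
    """Hash commitment to a choice (hypothetical stand-in for the paper's scheme)."""
    return hashlib.sha256(nonce + choice.encode("utf-8")).hexdigest()


def entropy_bits(probs) -> float:
    """Shannon entropy in bits: an upper bound on embeddable bits per decision."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical memory-write decision with three admissible link targets.
owner_key = b"owner-secret-key"
candidates = ["note_12", "note_47", "note_90"]
probs = [0.5, 0.3, 0.2]

choice = keyed_select(owner_key, b"session-1|call-0", candidates, probs)
record = commit(choice, nonce=b"session-1|call-0")
capacity = entropy_bits(probs)
```

Under these assumptions, a verifier holding `owner_key` can rerun `keyed_select` on the same context and check that the recomputed choice matches the stored one, whereas a wrong key would agree only at the chance rate implied by `probs`. The entropy helper illustrates why per-decision entropy, such as the 1.16 to 1.26 bits reported above, bounds how much signal each decision can carry.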

Constitutional AI & AI Ethics; Red-Teaming & Adversarial Robustness; Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
