East China University of Science and Technology | Fudan | Shanghai University of Electric Power | Apr 15, 2026 | arXiv:2604.13777

From Anchors to Supervision: Memory-Graph Guided Corpus-Free Unlearning for Large Language Models

Wenxuan Li, Zhenfei Zhang, Mi Zhang, Geng Hong, Mi Wen, Xiaoyu You, Min Yang

AI Summary

The paper introduces MAGE, a corpus-free unlearning framework for LLMs that minimizes reliance on user-provided forget sets by using only a lightweight user anchor. MAGE probes the LLM to construct a weighted local memory graph representing target-related memorization and then synthesizes scoped supervision for unlearning. Experiments on TOFU and RWKU benchmarks show MAGE achieves unlearning performance comparable to methods using external references, while maintaining overall utility and enhancing auditability.

Key Contribution

MAGE removes the need for user-provided forget sets: given only a lightweight anchor that identifies the target entity, it unlearns the LLM's target-related memorization with performance comparable to supervision generated from external references.

Abstract

Large language models (LLMs) may memorize sensitive or copyrighted content, raising significant privacy and legal concerns. While machine unlearning has emerged as a potential remedy, prevailing paradigms rely on user-provided forget sets, making unlearning requests difficult to audit and exposing systems to secondary leakage and malicious abuse. We propose MAGE, a Memory-grAph Guided Erasure framework for user-minimized, corpus-free unlearning. Given only a lightweight user anchor that identifies a target entity, MAGE probes the target LLM to recover target-related memorization, organizes it into a weighted local memory graph, and synthesizes scoped supervision for unlearning. MAGE is model-agnostic, can be plugged into standard unlearning methods, and requires no access to the original training corpus. Experiments on two benchmarks, TOFU and RWKU, demonstrate that MAGE's self-generated supervision achieves effective unlearning performance comparable to supervision generated with external reference, while preserving overall utility. These results support a practical and auditable unlearning workflow driven by minimal anchors rather than user-supplied forget corpora.
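The anchor-to-supervision pipeline described in the abstract can be illustrated with a toy sketch. The abstract does not specify how MAGE probes the model, weights graph edges, or synthesizes supervision, so everything below is an assumption made for illustration: the stubbed `toy_probe` function (standing in for sampling from the target LLM), the recall-frequency edge weight, the `min_weight` threshold, and the question template are all hypothetical and are not the authors' method.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: a recalled fact is a (relation, value) edge
# attached to the anchor entity.
Fact = Tuple[str, str]
ProbeFn = Callable[[str], List[Fact]]


def build_memory_graph(anchor: str, probe: ProbeFn,
                       templates: List[str]) -> Dict[Fact, float]:
    """Weight each edge by the fraction of probes in which it is recalled
    (an assumed weighting scheme, not taken from the paper)."""
    counts: Counter = Counter()
    for template in templates:
        # De-duplicate so one response counts at most once per fact.
        for fact in set(probe(template.format(anchor=anchor))):
            counts[fact] += 1
    return {fact: n / len(templates) for fact, n in counts.items()}


def synthesize_forget_set(anchor: str, graph: Dict[Fact, float],
                          min_weight: float = 0.5) -> List[Tuple[str, str]]:
    """Turn edges with weight >= min_weight into (question, answer) pairs."""
    ranked = sorted(graph.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(f"What is the {relation} of {anchor}?", value)
            for (relation, value), weight in ranked
            if weight >= min_weight]


def toy_probe(prompt: str) -> List[Fact]:
    """Stub standing in for the target LLM; returns canned 'memories'."""
    canned = {
        "Tell me about Jane Doe.": [("occupation", "novelist"),
                                    ("birthplace", "Lyon")],
        "What is Jane Doe known for?": [("occupation", "novelist")],
        "Who is Jane Doe?": [("occupation", "novelist"),
                             ("award", "Prix X")],
    }
    return canned.get(prompt, [])


templates = ["Tell me about {anchor}.",
             "What is {anchor} known for?",
             "Who is {anchor}?"]
graph = build_memory_graph("Jane Doe", toy_probe, templates)
forget_set = synthesize_forget_set("Jane Doe", graph)
print(forget_set)
```

In this sketch the resulting pairs would serve as the forget set passed to a standard unlearning method, with the user supplying only the anchor string. The threshold scopes supervision to facts the model recalls consistently; a real system would need a more careful notion of edge weight and scope.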

Constitutional AI & AI Ethics | Natural Language Processing | Red-Teaming & Adversarial Robustness

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
