This paper evaluates existing machine unlearning methods under repeated unlearning scenarios and identifies two key failure modes: knowledge erosion (accuracy degradation on retained data) and forgetting reversal (re-emergence of forgotten data). To address these issues, the authors propose SAFER, a continual unlearning framework that combines representation stability for retained data with negative logit margin enforcement for forgotten data. Experiments show that SAFER mitigates both knowledge erosion and forgetting reversal, maintaining stable performance across multiple unlearning phases.
Repeatedly unlearning data from a model causes it to gradually forget what it was supposed to remember and, surprisingly, re-learn what it already forgot.
As a means to balance the growth of the AI industry with the need for privacy protection, machine unlearning plays a crucial role in realizing the "right to be forgotten" in artificial intelligence. This technique enables AI systems to remove the influence of specific data while preserving the rest of the learned knowledge. Although it has been actively studied, most existing unlearning methods assume that unlearning is performed only once. In this work, we evaluate existing unlearning algorithms in a more realistic scenario where unlearning is conducted repeatedly, and in this setting, we identify two critical phenomena: (1) Knowledge Erosion, where the accuracy on retain data progressively degrades over unlearning phases, and (2) Forgetting Reversal, where previously forgotten samples become recognizable again in later phases. To address these challenges, we propose SAFER (StAbility-preserving Forgetting with Effective Regularization), a continual unlearning framework that maintains representation stability for retain data while enforcing negative logit margins for forget data. Extensive experiments show that SAFER mitigates not only knowledge erosion but also forgetting reversal, achieving stable performance across multiple unlearning phases.
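
The abstract names the two ingredients of SAFER but does not give their exact form. The sketch below is one plausible reading, not the authors' implementation. It assumes that "representation stability" means keeping retain-sample features close to those of a reference (pre-unlearning) model, and that a "negative logit margin" means pushing the true-class logit of each forget sample below the largest other-class logit by at least some margin. All function names, the hinge form, the squared-distance penalty, and the `margin` and `lam` parameters are hypothetical.

```python
from typing import Sequence


def logit_margin(logits: Sequence[float], label: int) -> float:
    """True-class logit minus the largest other-class logit."""
    best_other = max(z for k, z in enumerate(logits) if k != label)
    return logits[label] - best_other


def forget_margin_loss(
    logits_batch: Sequence[Sequence[float]],
    labels: Sequence[int],
    margin: float = 1.0,  # assumed hyperparameter, not from the paper
) -> float:
    """Hinge penalty that reaches zero once every forget sample
    has a logit margin of at most -margin (assumed form)."""
    losses = [
        max(0.0, logit_margin(z, y) + margin)
        for z, y in zip(logits_batch, labels)
    ]
    return sum(losses) / len(losses)


def retain_stability_loss(
    features: Sequence[Sequence[float]],
    reference_features: Sequence[Sequence[float]],
) -> float:
    """Mean squared difference between current features and the
    reference model's features on retain samples (assumed form)."""
    total, count = 0.0, 0
    for current, reference in zip(features, reference_features):
        for a, b in zip(current, reference):
            total += (a - b) ** 2
            count += 1
    return total / count


def combined_loss(
    forget_logits: Sequence[Sequence[float]],
    forget_labels: Sequence[int],
    retain_features: Sequence[Sequence[float]],
    retain_reference: Sequence[Sequence[float]],
    margin: float = 1.0,
    lam: float = 1.0,  # assumed weighting between the two terms
) -> float:
    """Forget-side margin penalty plus weighted retain-side stability penalty."""
    return forget_margin_loss(forget_logits, forget_labels, margin) + (
        lam * retain_stability_loss(retain_features, retain_reference)
    )
```

Under this reading, a forget sample whose true-class logit is still the largest contributes a positive penalty, while one already below the runner-up by the full margin contributes nothing. The retain term is zero only when the features have not drifted from the reference model.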