CAS | Deakin | Digital China Group | Harbin Engineering University | Jun 1, 2026 | arXiv:2606.02322

Repurposing Adversarial Perturbations for Continual Learning: From Defense to Active Alignment

Ran Liu, Min Yu, Mingqi Liu, Jianguo Jiang, Gang Li, Rongsheng Li, Ning Li, Zhen Xu, Weiqing Huang

AI Summary

This paper introduces AdvCL, an approach that repurposes adversarial perturbations as geometric control signals for continual learning in large language models. It combines three plug-in modules: Intra-Smooth, Proto-Clip, and Inter-Align. Together they aim to reduce forgetting and improve transfer across tasks while retaining robustness to adversarial perturbations. The reported experiments show consistent gains in both standard performance and robustness, and the modules give complementary gains when combined.

Key Contribution

Repurposing adversarial perturbations can significantly enhance continual learning, reducing forgetting and improving task transfer in large language models.

Abstract

In dynamic environments, large language models need to keep adapting to new tasks, but continual learning often suffers from forgetting, limited transfer, and vulnerability to adversarial perturbations. To address this, we present AdvCL, which repurposes adversarial perturbations as a geometric control signal for stable continual adaptation. AdvCL combines three plug-in modules: Intra-Smooth promotes local smoothness via small adversarial perturbations; Proto-Clip uses similarity clipping to prevent excessive alignment to the current task prototype; and Inter-Align applies directional alignment toward the previous task prototype to reduce representational gaps. Experiments show consistent gains in both standard performance and robustness, with lower forgetting and stronger transfer. We further analyze key mechanisms by quantifying the sensitivity of Intra-Smooth to perturbation settings and the effect of Inter-Align on task similarity and geometric distance. In summary, the modules provide complementary gains when combined, and each can also be integrated individually into diverse CL paradigms, including replay, regularization, and dynamic architectures, thereby offering a geometric control mechanism for continual learning.
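The abstract describes the three modules only in words. The following is a minimal sketch of how such terms could look as loss functions, not the authors' implementation. Everything specific here is an assumption made for illustration: the linear encoder `W`, the cosine-based similarity, the clipping threshold `tau`, the perturbation radius `eps`, the loss weights, and all function names.

```python
import numpy as np

def cosine(a, b, tiny=1e-8):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + tiny))

def intra_smooth_loss(W, x, eps=0.1):
    """Squared representation shift under a worst-case L2 perturbation.

    Assumption: a toy linear encoder h = W @ x. For a linear map, the
    perturbation of norm eps that moves h the most is eps times the top
    right singular vector of W, so no gradient computation is needed.
    """
    _, _, vt = np.linalg.svd(W)
    delta = eps * vt[0]
    return float(np.sum((W @ (x + delta) - W @ x) ** 2))

def proto_clip_loss(h, proto_cur, tau=0.8):
    """Pull h toward the current-task prototype only until the cosine
    similarity reaches tau (hypothetical threshold); beyond that, no pull."""
    return max(0.0, tau - cosine(h, proto_cur))

def inter_align_loss(h, proto_prev):
    """Penalize the angular gap between h and a previous-task prototype."""
    return 1.0 - cosine(h, proto_prev)

def combined_objective(task_loss, W, x, proto_cur, proto_prev,
                       weights=(1.0, 1.0, 1.0)):
    """Task loss plus the three terms, with hypothetical weights."""
    h = W @ x
    w_smooth, w_clip, w_align = weights
    return (task_loss
            + w_smooth * intra_smooth_loss(W, x)
            + w_clip * proto_clip_loss(h, proto_cur)
            + w_align * inter_align_loss(h, proto_prev))
```

Each term is a standalone function, mirroring the abstract's statement that the modules can be integrated individually; a real LLM setting would replace the toy linear encoder with a gradient-based perturbation on model embeddings.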

Red-Teaming & Adversarial Robustness

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
