Radboud | TU Delft | University of Bergen | University of Zagreb | Mar 10, 2026 | arXiv:2603.09772

Removing the Trigger, Not the Backdoor: Alternative Triggers and Latent Backdoors

Gorka Abad, Ermes Franch, Stefanos Koffas, Stjepan Picek

AI Summary

This paper demonstrates that backdoors in neural networks can be activated by "alternative triggers", patterns perceptually distinct from the original training trigger. The authors prove theoretically and verify empirically that such triggers exist, and show that they exploit a latent backdoor direction in feature space even when the original trigger is neutralized. They develop a feature-guided attack that jointly optimizes target prediction and directional alignment, activating the backdoor through these alternative triggers.

Key Contribution

Backdoor defenses that focus on removing training triggers are incomplete: alternative, perceptually distinct triggers can reliably activate the same backdoor via a latent feature-space direction.

Abstract

Current backdoor defenses assume that neutralizing a known trigger removes the backdoor. We show this trigger-centric view is incomplete: alternative triggers, patterns perceptually distinct from training triggers, reliably activate the same backdoor. We estimate the alternative trigger backdoor direction in feature space by contrasting clean and triggered representations, and then develop a feature-guided attack that jointly optimizes target prediction and directional alignment. First, we theoretically prove that alternative triggers exist and are an inevitable consequence of backdoor training. Then, we verify this empirically. Additionally, defenses that remove training triggers often leave backdoors intact, and alternative triggers can exploit the latent backdoor feature-space. Our findings motivate defenses targeting backdoor directions in representation space rather than input-space triggers.
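
The two steps named in the abstract, estimating a backdoor direction by contrasting clean and triggered representations and then scoring a candidate input on both target prediction and directional alignment, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the direction is the normalized difference of mean feature vectors and that alignment is measured by cosine similarity; the function names, the weight `lam`, and the exact form of the loss are hypothetical choices made only for illustration.

```python
import numpy as np

def estimate_backdoor_direction(clean_feats, triggered_feats):
    """Unit vector pointing from the mean clean feature to the mean triggered feature.

    Both inputs are arrays of shape (num_samples, feature_dim).
    Assumption: a difference of means is used as the contrast.
    """
    diff = triggered_feats.mean(axis=0) - clean_feats.mean(axis=0)
    norm = np.linalg.norm(diff)
    if norm == 0:
        raise ValueError("clean and triggered feature means coincide")
    return diff / norm

def feature_guided_loss(logits, feat_shift, direction, target, lam=1.0):
    """Cross-entropy toward `target` plus a penalty for misalignment.

    feat_shift: feature of the perturbed input minus feature of the clean input.
    lam: hypothetical weight balancing the two terms.
    """
    # Numerically stable log-softmax for a single logit vector.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    cross_entropy = -log_probs[target]
    # Cosine similarity between the feature shift and the unit direction.
    cosine = feat_shift @ direction / (np.linalg.norm(feat_shift) + 1e-12)
    return cross_entropy + lam * (1.0 - cosine)
```

In an attack loop, a perturbation would be updated to reduce this loss, so that the model predicts the target class while the induced feature shift points along the estimated direction. The optimizer and the perceptual constraints on the perturbation are left out of this sketch.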

Interpretability & Mechanistic Interp | Red-Teaming & Adversarial Robustness

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
