CSIRO | Griffith | UMacau | UTS | Mar 17, 2026 | arXiv:2603.16405

Poisoning the Pixels: Revisiting Backdoor Attacks on Semantic Segmentation

Guangsheng Zhang, Huan Tian, L. Zhang, Leo Zhang, Tianqing Zhu, Ming Ding, Wanlei Zhou, Bo Liu

AI Summary

This paper introduces BADSEG, a unified framework for backdoor attacks on semantic segmentation models, exploring four coarse-grained and two fine-grained attack vectors. BADSEG optimizes trigger designs and label manipulation to maximize attack success while maintaining utility on clean samples. Experiments show BADSEG achieves high attack effectiveness across diverse architectures, including transformers and SAM, while current defenses fail to reliably mitigate these attacks.

Key Contribution

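Semantic segmentation models, including recent transformer-based networks and the Segment Anything Model (SAM), are vulnerable to a broader set of backdoor attacks than the object-to-background case studied previously, and the six representative defenses evaluated fail to mitigate them reliably.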

Abstract

Semantic segmentation models are widely deployed in safety-critical applications such as autonomous driving, yet their vulnerability to backdoor attacks remains largely underexplored. Prior segmentation backdoor studies transfer threat settings from existing image classification tasks, focusing primarily on object-to-background mis-segmentation. In this work, we revisit the threats by systematically examining backdoor attacks tailored to semantic segmentation. We identify four coarse-grained attack vectors (Object-to-Object, Object-to-Background, Background-to-Object, and Background-to-Background attacks), as well as two fine-grained vectors (Instance-Level and Conditional attacks). To formalize these attacks, we introduce BADSEG, a unified framework that optimizes trigger designs and applies label manipulation strategies to maximize attack performance while preserving victim model utility. Extensive experiments across diverse segmentation architectures on benchmark datasets demonstrate that BADSEG achieves high attack effectiveness with minimal impact on clean samples. We further evaluate six representative defenses and find that they fail to reliably mitigate our attacks, revealing critical gaps in current defenses. Finally, we demonstrate that these vulnerabilities persist in recent emerging architectures, including transformer-based networks and the Segment Anything Model (SAM), thereby compromising their security. Our work reveals previously overlooked security vulnerabilities in semantic segmentation, and motivates the development of defenses tailored to segmentation-specific threat models.
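
To make the label-manipulation idea concrete, the following is a minimal sketch of how a single training sample might be poisoned for an Object-to-Object attack: a trigger patch is pasted onto the image, and every pixel labeled with a source class is relabeled to a target class. This is a toy illustration, not the BADSEG method itself, which optimizes the trigger design rather than pasting a fixed patch. The function name `poison_sample`, the class ids, the grayscale list-of-lists image format, and the fixed patch location are all assumptions made for this example.

```python
from copy import deepcopy


def poison_sample(image, mask, source_class, target_class, trigger, top=0, left=0):
    """Return a poisoned (image, mask) pair for a toy Object-to-Object attack.

    image   : H x W nested list of pixel intensities (grayscale for simplicity)
    mask    : H x W nested list of integer class ids
    trigger : h x w nested list pasted with its corner at (top, left)

    All names and the data layout are hypothetical, chosen for illustration.
    """
    poisoned_image = deepcopy(image)
    poisoned_mask = deepcopy(mask)

    # Step 1: stamp the fixed trigger patch onto the image.
    for r, row in enumerate(trigger):
        for c, value in enumerate(row):
            poisoned_image[top + r][left + c] = value

    # Step 2: label manipulation, relabel source-class pixels as the target class.
    for r, row in enumerate(poisoned_mask):
        for c, label in enumerate(row):
            if label == source_class:
                poisoned_mask[r][c] = target_class

    return poisoned_image, poisoned_mask


# Toy 4x4 sample: class 0 = background, 1 = source object, 2 = target object.
image = [[0] * 4 for _ in range(4)]
mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 0, 0],
    [2, 2, 0, 0],
]
trigger = [[255, 255], [255, 255]]

p_img, p_mask = poison_sample(image, mask, source_class=1, target_class=2, trigger=trigger)
```

Under the same assumptions, the other coarse-grained vectors differ only in which labels are swapped: treating class 0 as the target would correspond to Object-to-Background, and treating it as the source to Background-to-Object. The original `image` and `mask` are left unmodified, so the clean sample can remain in the training set alongside the poisoned copy.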

Computer Vision | Red-Teaming & Adversarial Robustness

Citation Metrics

Citations: 0

Influential citations: 0

References: 62

Year: 2026

Venue: N/A
