This paper introduces the Contrastive Augmented Transformer (CAT), a framework for metal surface defect detection that targets two problems: limited annotated data and subtle, multi-scale defects. CAT combines a hierarchical Swin Transformer backbone with a redesigned feature pyramid network and a domain-specific droplet augmentation algorithm, and adds hard negative mining to its contrastive loss. On the KolektorSDD2 dataset it reaches a pixel-level AUROC of 99.54%, outperforming existing methods, and it generalizes to three unseen datasets (KSDD1, MTD, and MSDD).
CAT achieves a pixel-level AUROC of 99.54% on KolektorSDD2, outperforming existing methods for metal surface defect detection.
Metal surface defect detection is critical for maintaining product quality in industrial manufacturing. However, it faces significant challenges, including limited annotated data, difficulty in identifying subtle multi-scale defects, and poor generalization across diverse scenarios. To address these issues, this paper proposes a novel Contrastive Augmented Transformer (CAT) framework for robust defect detection. CAT employs a hierarchical Swin Transformer backbone and redesigns the feature pyramid network to fuse low-level textures with high-level semantics, enabling precise modeling of subtle and multi-scale defect patterns. To enhance robustness under real-world noise conditions, we propose a domain-specific droplet augmentation algorithm. Furthermore, we incorporate a hard negative mining strategy into the contrastive loss to strengthen the model's discrimination ability in ambiguous defect regions. Experimental results on the KolektorSDD2 dataset demonstrate that CAT achieves a pixel-level AUROC of 99.54%, outperforming existing methods. In addition, CAT exhibits superior generalization and robustness on three unseen datasets: KSDD1, MTD (tile defects), and MSDD (rail surface defects), demonstrating its potential for large-scale industrial deployment.
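For readers unfamiliar with the headline metric, the following is a minimal sketch of how a pixel-level AUROC is commonly computed: every pixel of every test image is pooled into one binary classification problem, and the area under the ROC curve is taken over that pool. This is not the authors' evaluation code. The function name `pixel_auroc` and the nested-list input format are assumptions made for illustration, and the rank-based (Mann-Whitney U) formulation is used here only to keep the example dependency-free.

```python
def pixel_auroc(score_maps, gt_masks):
    """Pixel-level AUROC over a set of images (illustrative sketch).

    score_maps: list of 2-D lists of anomaly scores (higher = more defective).
    gt_masks:   list of 2-D lists of 0/1 ground-truth labels, same shapes.
    """
    # Pool every pixel of every image into one list of (score, label) pairs.
    pairs = [
        (s, int(y))
        for smap, mask in zip(score_maps, gt_masks)
        for srow, mrow in zip(smap, mask)
        for s, y in zip(srow, mrow)
    ]
    n_pos = sum(y for _, y in pairs)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs both defect and background pixels")

    # Rank-based formulation: sort by score, give tied scores their average
    # rank, and sum the ranks of the defect (positive) pixels.
    pairs.sort(key=lambda p: p[0])
    pos_rank_sum = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # 1-based ranks i+1..j share this value
        pos_rank_sum += avg_rank * sum(y for _, y in pairs[i:j])
        i = j

    # Mann-Whitney U statistic, normalized by the number of pos/neg pairs.
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Toy example: one 2x2 image whose defect pixels all outscore the background.
scores = [[[0.1, 0.2], [0.8, 0.9]]]
masks = [[[0, 0], [1, 1]]]
print(pixel_auroc(scores, masks))  # 1.0
```

Because defect pixels are usually a small fraction of each image, this metric is dominated by how well the background is ranked below the defects. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.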