Apr 30, 2026 · arXiv:2604.28118

DEFault++: Automated Fault Detection, Categorization, and Diagnosis for Transformer Architectures

Sigma Jahan, Saurabhsingh Rajput, Tushar Sharma, Mohammad Masudur Rahman

AI Summary

DEFault++ is introduced as a hierarchical learning-based diagnostic technique for transformer models, designed to detect, categorize, and diagnose faults within attention mechanisms and other internal components. It uses a Fault Propagation Graph (FPG) derived from the transformer architecture and combines prototype matching with supervised contrastive learning to produce interpretable diagnoses. Evaluated on DEFault-bench, a benchmark of 3,739 labeled instances generated via mutation testing, DEFault++ achieves high AUROC and Macro-F1 scores and significantly improves developer accuracy in choosing correct repair actions.

Key Contribution

DEFault++ pinpoints the root cause of transformer failures at three levels: it detects whether a fault is present, assigns it to one of 12 transformer-specific fault categories, and identifies the responsible mechanism from up to 45 candidates.

Abstract

Transformer models are widely deployed in critical AI applications, yet faults in their attention mechanisms, projections, and other internal components often degrade behavior silently without raising runtime errors. Existing fault diagnosis techniques often target generic deep neural networks and cannot identify which transformer component is responsible for an observed symptom. In this article, we present DEFault++, a hierarchical learning-based diagnostic technique that operates at three levels of abstraction: it detects whether a fault is present, classifies it into one of 12 transformer-specific fault categories (covering both attention-internal mechanisms and surrounding architectural components), and identifies the underlying root cause from up to 45 mechanisms. To facilitate both training and evaluation, we construct DEFault-bench, a benchmark of 3,739 labeled instances obtained through systematic mutation testing. These instances are created across seven transformer models and nine downstream tasks using DEForm, a transformer-specific mutation technique we developed for this purpose. DEFault++ measures runtime behavior at the level of individual transformer components. It organizes these measurements through a Fault Propagation Graph (FPG) derived from the transformer architecture. It then produces an interpretable diagnosis using prototype matching combined with supervised contrastive learning. On DEFault-bench, DEFault++ exceeds an AUROC of 0.96 for detection and a Macro-F1 of 0.85 for both categorization and root-cause diagnosis on encoder and decoder architectures. In a developer study with 21 practitioners, the accuracy of choosing correct repair actions increased from 57.1% without support to 83.3% when using DEFault++.
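The abstract names prototype matching as the step that turns learned representations into a diagnosis. The sketch below illustrates the general idea only and is not the authors' implementation: it assumes each labeled instance already has an embedding (in the paper these would come from supervised contrastive learning, which is not shown here), averages the embeddings of each class into a prototype, and assigns a new instance to the prototype with the highest cosine similarity. The function names, the toy 2-D vectors, and the fault-category labels are all hypothetical.

```python
import math
from collections import defaultdict


def l2_normalize(vec):
    """Scale a vector to unit length; return a copy unchanged if it is all zeros."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def build_prototypes(embeddings, labels):
    """Average the unit-normalized embeddings of each class into one prototype."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(embeddings, labels):
        unit = l2_normalize(vec)
        if label not in sums:
            sums[label] = [0.0] * len(unit)
        sums[label] = [s + u for s, u in zip(sums[label], unit)]
        counts[label] += 1
    return {
        label: l2_normalize([s / counts[label] for s in total])
        for label, total in sums.items()
    }


def diagnose(embedding, prototypes):
    """Return the best-matching label and the cosine similarity to every prototype."""
    unit = l2_normalize(embedding)
    sims = {
        label: sum(a * b for a, b in zip(unit, proto))
        for label, proto in prototypes.items()
    }
    best = max(sims, key=sims.get)
    return best, sims


# Toy 2-D embeddings with hypothetical fault-category labels (assumptions, not paper data).
train_embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
train_labels = ["attention_mask", "attention_mask", "layer_norm", "layer_norm"]

prototypes = build_prototypes(train_embeddings, train_labels)
label, sims = diagnose([0.95, 0.15], prototypes)
print(label, sims)
```

Returning the full similarity dictionary alongside the predicted label is one plausible way such a scheme stays inspectable: a reader can see how close the instance sits to every candidate class, not just the winner.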

Architecture Design (Transformers, SSMs, MoE) · Interpretability & Mechanistic Interp

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
