Apr 20, 2026 · arXiv:2604.17998

Causally-Constrained Probabilistic Forecasting for Time-Series Anomaly Detection

Pooyan Khosravinia, João Gama, Bruno Veloso

AI Summary

This study introduces a Causally Guided Transformer (CGT) model for anomaly detection in multivariate time series, integrating a time-lagged causal graph to support causal interpretation and root-cause localization. The main prediction pathway uses a hard parent mask so that each forecast depends only on graph-supported causes, while a shadow auxiliary path exploits residual correlational information without affecting the causal representation. Experimental results show state-of-the-art performance, with F1-scores of 96.19% on the ASD benchmark and 95.32% on the SMD benchmark, highlighting the value of causal structural priors for detection robustness and attribution quality.
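The hard parent mask mentioned above can be illustrated with a minimal sketch. The edge format `(parent, child, lag)`, the function names, and the choice to zero out non-parent inputs are all assumptions made for illustration; in a Transformer the same constraint would more likely be applied as an attention mask, and the paper's actual implementation may differ.

```python
from typing import List, Tuple

# Hypothetical edge format: (parent_index, child_index, lag), with lag >= 1.
Edge = Tuple[int, int, int]


def build_parent_mask(edges: List[Edge], target: int,
                      n_vars: int, max_lag: int) -> List[List[bool]]:
    """Return mask[lag - 1][var], True iff `var` at that lag is a parent of `target`."""
    mask = [[False] * n_vars for _ in range(max_lag)]
    for parent, child, lag in edges:
        if child == target and 1 <= lag <= max_lag:
            mask[lag - 1][parent] = True
    return mask


def apply_mask(window: List[List[float]],
               mask: List[List[bool]]) -> List[List[float]]:
    """window[lag - 1][var] holds x_var(t - lag); non-parent entries are zeroed."""
    return [[x if keep else 0.0 for x, keep in zip(row, mask_row)]
            for row, mask_row in zip(window, mask)]


# Toy time-lagged graph over three variables.
edges = [(0, 2, 1), (1, 2, 2), (2, 0, 1)]
mask = build_parent_mask(edges, target=2, n_vars=3, max_lag=2)

# Rows are lags 1 and 2, columns are variables 0..2.
window = [[0.5, 1.5, 2.5], [3.5, 4.5, 5.5]]
masked = apply_mask(window, mask)
```

Here only variable 0 at lag 1 and variable 1 at lag 2 remain visible to the forecaster for variable 2; every other entry is hidden regardless of how strongly it correlates with the target.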

Key Contribution

Causal structural priors can significantly enhance both the robustness and interpretability of anomaly detection in complex multivariate time series.

Abstract

Anomaly detection in multivariate time series is a central challenge in industrial monitoring, as failures frequently arise from complex temporal dynamics and cross-sensor interactions. While recent deep learning models, including graph neural networks and Transformers, have demonstrated strong empirical performance, most approaches remain primarily correlational and offer limited support for causal interpretation and root-cause localization. This study introduces a causally-constrained probabilistic forecasting framework, the Causally Guided Transformer (CGT), for multivariate time-series anomaly detection, which integrates an explicit time-lagged causal graph prior with deep sequence modeling. For each target variable, a dedicated forecasting block employs a hard parent mask derived from causal discovery to restrict the main prediction pathway to graph-supported causes, while a latent Gaussian head captures predictive uncertainty. To leverage residual correlational information without compromising the causal representation, the model adds a shadow auxiliary path with stop-gradient isolation and a safety-gated blending mechanism that suppresses non-causal contributions when their reliability is low. Anomalies are identified using negative log-likelihood scores with adaptive streaming thresholding, and root-cause variables are determined through per-dimension probabilistic attribution and counterfactual clamping. Experiments on the ASD and SMD benchmarks indicate that the proposed method achieves state-of-the-art detection performance, with F1-scores of 96.19% on ASD and 95.32% on SMD, and improves variable-level attribution quality. These findings suggest that causal structural priors can improve both the robustness and the interpretability of deep anomaly detection in multivariate sensor systems.
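The scoring step described in the abstract can be sketched as follows, assuming the Gaussian head outputs a mean and standard deviation per variable. The abstract does not specify the adaptive thresholding rule, so the running mean plus `k` standard deviations used here (with hypothetical parameters `k` and `warmup`) is a stand-in rather than the authors' method. Attribution is reduced to picking the variable with the largest per-dimension negative log-likelihood; counterfactual clamping is not covered.

```python
import math
from typing import List, Tuple


def gaussian_nll(x: float, mu: float, sigma: float) -> float:
    """Negative log-likelihood of x under N(mu, sigma^2)."""
    return 0.5 * math.log(2 * math.pi * sigma ** 2) + (x - mu) ** 2 / (2 * sigma ** 2)


def score(x: List[float], mu: List[float],
          sigma: List[float]) -> Tuple[float, List[float]]:
    """Return the total anomaly score and its per-dimension contributions."""
    per_dim = [gaussian_nll(a, m, s) for a, m, s in zip(x, mu, sigma)]
    return sum(per_dim), per_dim


class StreamingThreshold:
    """Running mean + k * std over past normal scores (Welford's algorithm).

    Assumption: a stand-in for the unspecified adaptive rule in the abstract.
    """

    def __init__(self, k: float = 3.0, warmup: int = 5) -> None:
        self.k, self.warmup = k, warmup
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, s: float) -> bool:
        flagged = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            flagged = s > self.mean + self.k * std
        if not flagged:  # flagged scores do not update the statistics
            self.n += 1
            delta = s - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (s - self.mean)
        return flagged


# Toy stream: the forecaster predicts N(0, 1) for each of three variables.
mu, sigma = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
observations = [
    [0.1, -0.2, 0.0], [0.0, 0.1, -0.1], [-0.1, 0.0, 0.2],
    [0.2, 0.1, 0.0], [0.0, -0.1, 0.1], [0.1, 0.0, -0.2],
    [0.0, 6.0, 0.1],  # variable 1 deviates sharply from its forecast
]

threshold = StreamingThreshold()
flags, root_cause = [], None
for x in observations:
    total, per_dim = score(x, mu, sigma)
    is_anomaly = threshold.update(total)
    flags.append(is_anomaly)
    if is_anomaly:
        root_cause = per_dim.index(max(per_dim))
```

In this toy stream only the final observation exceeds the threshold, and the attribution points at variable 1, whose value lies six standard deviations from its predicted mean.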
