This paper revisits the performance of OmniAnomaly, a popular deep learning model for multivariate time series anomaly detection (MTSAD), and compares it against a PCA-based linear baseline on the Server Machine Dataset (SMD). By employing identical thresholding and evaluation procedures across 100 runs, the study reveals that PCA achieves comparable, and sometimes superior, performance to OmniAnomaly, particularly without point adjustment. The findings challenge the perceived superiority of complex deep learning architectures in MTSAD and emphasize the importance of standardized evaluation methodologies.
Deep learning's dominance in time series anomaly detection may be overstated: a carefully evaluated PCA baseline rivals the performance of the widely used OmniAnomaly.
Deep learning models have become the dominant approach for multivariate time series anomaly detection (MTSAD), often reporting substantial performance improvements over classical statistical methods. However, these gains are frequently measured under heterogeneous thresholding strategies and evaluation protocols, making fair comparisons difficult. This work revisits OmniAnomaly, a widely used stochastic recurrent model for MTSAD, and systematically compares it with a simple linear baseline based on Principal Component Analysis (PCA) on the Server Machine Dataset (SMD). Both methods are evaluated under identical thresholding and evaluation procedures, with experiments repeated across 100 runs for each of the 28 machines in the dataset. Performance is measured using point-level Precision, Recall, and F1-score, with and without point adjustment, and under different aggregation strategies across machines and runs, with the corresponding standard deviations also reported. The results show large variability across machines and indicate that PCA can achieve performance comparable to OmniAnomaly, and even outperform it when point adjustment is not applied. These findings question the added value of more complex architectures under current benchmarking practices and highlight the critical role of evaluation methodology in MTSAD research.
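To make the evaluation terms concrete, the following is a minimal sketch of a PCA reconstruction-error score and of point-level Precision, Recall, and F1 with and without point adjustment. It is an illustration under stated assumptions, not the paper's implementation: the function names (`pca_scores`, `point_adjust`, `precision_recall_f1`) are hypothetical, the squared reconstruction error is one common choice of PCA anomaly score, and point adjustment is written in its usual form, where a single detection inside a true anomaly segment marks the whole segment as detected. The thresholding procedure and the number of components used in the study are not specified here and are left to the caller.

```python
import numpy as np


def pca_scores(train, test, n_components):
    """Squared reconstruction error of `test` under a PCA fitted on `train`.

    Assumption: rows are time steps, columns are features.
    """
    mean = train.mean(axis=0)
    # Principal directions are the right singular vectors of the centered data.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    components = vt[:n_components]
    centered = test - mean
    # Project onto the retained subspace and map back to feature space.
    reconstructed = centered @ components.T @ components
    return np.sum((centered - reconstructed) ** 2, axis=1)


def point_adjust(pred, label):
    """Mark a whole true anomaly segment as detected if any point in it is flagged."""
    pred = np.asarray(pred).astype(bool).copy()
    label = np.asarray(label).astype(bool)
    # Locate contiguous runs of True in the labels.
    padded = np.concatenate(([False], label, [False])).astype(int)
    diff = np.diff(padded)
    starts = np.flatnonzero(diff == 1)
    ends = np.flatnonzero(diff == -1)  # exclusive end index of each run
    for start, end in zip(starts, ends):
        if pred[start:end].any():
            pred[start:end] = True
    return pred


def precision_recall_f1(pred, label):
    """Point-level Precision, Recall, and F1 for binary predictions."""
    pred = np.asarray(pred).astype(bool)
    label = np.asarray(label).astype(bool)
    tp = int(np.sum(pred & label))
    fp = int(np.sum(pred & ~label))
    fn = int(np.sum(~pred & label))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy example: two true anomaly segments, one of which is hit at a single point.
label = np.array([0, 1, 1, 1, 0, 0, 1, 1, 0])
pred = np.array([0, 0, 1, 0, 0, 1, 0, 0, 0])
raw_metrics = precision_recall_f1(pred, label)
adjusted_metrics = precision_recall_f1(point_adjust(pred, label), label)
```

In this toy case, point adjustment raises recall from 0.2 to 0.6 without any change in the detector itself, which illustrates why results reported with and without it are not directly comparable.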