The paper introduces CINDI, an unsupervised probabilistic framework built on conditional normalizing flows that jointly detects anomalies in multivariate time series and imputes replacement values, with power grid data as the motivating application. CINDI models the conditional likelihood of the data, identifies low-probability segments, and replaces them with statistically consistent samples, so that anomaly detection and imputation share a single model. Experiments on real-world grid loss data from a Norwegian power distribution operator show robust performance against competitive baselines, and the authors present the approach as scalable.
Forget disjoint anomaly detection and imputation pipelines: CINDI uses conditional normalizing flows to jointly detect and restore corrupted time series data, with robust performance against competitive baselines.
Real-world multivariate time series, particularly in critical infrastructure such as electrical power grids, are often corrupted by noise and anomalies that degrade the performance of downstream tasks. Standard data cleaning approaches often rely on disjoint strategies, which involve detecting errors with one model and imputing them with another. Such approaches can fail to capture the full joint distribution of the data and ignore prediction uncertainty. This work introduces Conditional Imputation and Noisy Data Integrity (CINDI), an unsupervised probabilistic framework designed to restore data integrity in complex time series. Unlike fragmented approaches, CINDI unifies anomaly detection and imputation into a single end-to-end system built on conditional normalizing flows. By modeling the exact conditional likelihood of the data, the framework identifies low-probability segments and iteratively samples statistically consistent replacements. This allows CINDI to efficiently reuse learned information while preserving the underlying physical and statistical properties of the system. We evaluate the framework using real-world grid loss data from a Norwegian power distribution operator, though the methodology is designed to generalize to any multivariate time series domain. The results demonstrate that CINDI yields robust performance compared to competitive baselines, offering a scalable solution for maintaining reliability in noisy environments.
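The detect-then-resample loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a single affine map z = (x - (a*c + b)) / sigma with a standard normal base stands in for a trained conditional normalizing flow, the context c is a hypothetical exogenous covariate, and the function names, the z_max threshold, and the iteration count are all assumptions made for illustration.

```python
import numpy as np

def fit_affine_flow(x, c):
    """Least-squares fit of x ~ a*c + b with residual scale sigma.
    Stand-in for training a conditional flow (assumption, not CINDI itself)."""
    design = np.column_stack([c, np.ones_like(c)])
    (a, b), *_ = np.linalg.lstsq(design, x, rcond=None)
    sigma = np.std(x - (a * c + b)) + 1e-8
    return a, b, sigma

def log_likelihood(x, c, params):
    """Exact conditional log-density via change of variables:
    log p(x|c) = log N(z; 0, 1) + log|dz/dx|, with dz/dx = 1/sigma."""
    a, b, sigma = params
    z = (x - (a * c + b)) / sigma
    return -0.5 * (z**2 + np.log(2.0 * np.pi)) - np.log(sigma)

def detect_and_impute(x, c, n_iters=3, z_max=4.0, seed=0):
    """Flag low-likelihood points, resample them from p(x|c), refit, repeat."""
    rng = np.random.default_rng(seed)
    x = x.copy()  # leave the caller's array untouched
    flagged = np.zeros(len(x), dtype=bool)
    for _ in range(n_iters):
        # Refit on points currently believed to be clean.
        params = fit_affine_flow(x[~flagged], c[~flagged])
        a, b, sigma = params
        # Log-density of a point lying z_max base standard deviations out.
        threshold = -0.5 * (z_max**2 + np.log(2.0 * np.pi)) - np.log(sigma)
        new = (log_likelihood(x, c, params) < threshold) & ~flagged
        flagged |= new
        # Replace every flagged point with a fresh sample from the refit model.
        n_flagged = int(flagged.sum())
        x[flagged] = a * c[flagged] + b + sigma * rng.standard_normal(n_flagged)
        if not new.any():
            break
    return x, flagged

# Synthetic usage: a linear relation with three injected spikes.
rng = np.random.default_rng(42)
c = rng.uniform(0.0, 10.0, 500)
x_bad = 2.0 * c + 1.0 + 0.5 * rng.standard_normal(500)
x_bad[[50, 200, 350]] += 20.0
restored, flagged = detect_and_impute(x_bad, c)
```

The same fitted density serves both steps here: it scores each point for detection and generates the replacement values, which is the reuse of learned information that the abstract contrasts with two-model pipelines. A real flow would replace the affine map with a learned invertible network and would typically score segments rather than single points.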