Mar 16, 2026, arXiv:2603.15335

Data Augmentation via Causal-Residual Bootstrapping

Mateusz Gajewski, Sophia Xiao, Bijan Mazaheri

AI Summary

This paper introduces a data augmentation technique called Causal-Residual Bootstrapping that leverages causal knowledge and the principle of independent mechanisms to generate new data points. The method involves permuting the residuals of models trained on marginal probability distributions, effectively incorporating information beyond Markov equivalence classes in settings with additive noise. Experiments demonstrate that predictive models trained on data augmented with this approach exhibit improved accuracy, supported by theoretical analysis in linear Gaussian settings.

Key Contribution

Causal-Residual Bootstrapping lets you inject more causal knowledge into your data augmentation pipeline than previous methods, leading to better model accuracy.

Abstract

Data augmentation integrates domain knowledge into a dataset by making domain-informed modifications to existing data points. For example, image data can be augmented by duplicating images in different tints or orientations, thereby incorporating the knowledge that images may vary in these dimensions. Recent work by Teshima and Sugiyama has explored the integration of causal knowledge (e.g., A causes B, which in turn causes C) up to conditional independence equivalence. We suggest a related approach for settings with additive noise that can incorporate information beyond a Markov equivalence class. The approach, built on the principle of independent mechanisms, permutes the residuals of models built on marginal probability distributions. Predictive models built on our augmented data demonstrate improved accuracy, for which we provide theoretical backing in linear Gaussian settings.
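To make the residual-permutation idea concrete, here is a minimal sketch, not the authors' algorithm. It assumes a known linear chain A → B → C with additive noise, fits one least-squares line per mechanism, and reattaches shuffled residuals to the fitted values. The helper names (`fit_line`, `residuals`, `augment_chain`), the synthetic data, and the choice of simple linear regression are all illustrative assumptions; the abstract does not specify these details.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y ~ slope * x + intercept (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(x, y, slope, intercept):
    """Observed value minus fitted value, one residual per data point."""
    return [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]


def augment_chain(a, b, c, rng):
    """Create one synthetic copy of (a, b, c), assuming the chain A -> B -> C.

    Each mechanism's residuals are shuffled independently, which relies on
    the additive-noise assumption that noise is independent of the parent.
    """
    slope_b, icpt_b = fit_line(a, b)
    slope_c, icpt_c = fit_line(b, c)
    res_b = residuals(a, b, slope_b, icpt_b)
    res_c = residuals(b, c, slope_c, icpt_c)

    # rng.sample with k == len(population) returns a random permutation.
    perm_b = rng.sample(res_b, len(res_b))
    perm_c = rng.sample(res_c, len(res_c))

    # Propagate along the causal order: new B from A, then new C from new B.
    new_b = [slope_b * ai + icpt_b + r for ai, r in zip(a, perm_b)]
    new_c = [slope_c * bi + icpt_c + r for bi, r in zip(new_b, perm_c)]
    return list(a), new_b, new_c


# Synthetic linear Gaussian data, made up purely for illustration.
rng = random.Random(0)
n = 200
a = [rng.gauss(0.0, 1.0) for _ in range(n)]
b = [2.0 * ai + rng.gauss(0.0, 0.5) for ai in a]
c = [-1.0 * bi + rng.gauss(0.0, 0.5) for bi in b]

a_aug, b_aug, c_aug = augment_chain(a, b, c, rng)

# Original rows followed by the synthetic rows.
augmented = list(zip(a, b, c)) + list(zip(a_aug, b_aug, c_aug))
```

The sketch propagates values in causal order, so a direction-reversed chain (C → B → A) would yield different synthetic points even though both chains imply the same conditional independences. This is one way to read the abstract's claim about using information beyond a Markov equivalence class.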

Computer Vision Data Curation & Synthetic Data

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
