This paper addresses domain generalization in anti-causal settings, where the outcome causes the covariates, a structure that makes unlabeled data usable. The authors propose regularizing the model's sensitivity to environment perturbations that affect the covariates; these perturbation directions can be estimated without labels. They prove worst-case optimality guarantees, under certain classes of environments, for two methods that penalize sensitivity to variations in the mean and covariance of the covariates, and they evaluate both methods empirically on a controlled physical system and a physiological signal dataset.
Unlock domain generalization with unlabeled data by exploiting the structure of anti-causal relationships, where outcomes cause covariates.
The problem of domain generalization concerns learning predictive models that are robust to distribution shifts when deployed in new, previously unseen environments. Existing methods typically require labeled data from multiple training environments, limiting their applicability when labeled data are scarce. In this work, we study domain generalization in an anti-causal setting, where the outcome causes the observed covariates. Under this structure, environment perturbations that affect the covariates do not propagate to the outcome, which motivates regularizing the model's sensitivity to these perturbations. Crucially, estimating these perturbation directions does not require labels, enabling us to leverage unlabeled data from multiple environments. We propose two methods that penalize the model's sensitivity to variations in the mean and covariance of the covariates across environments, respectively, and prove that these methods have worst-case optimality guarantees under certain classes of environments. Finally, we demonstrate the empirical performance of our approach on a controlled physical system and a physiological signal dataset.
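The mean-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes a linear model with squared loss, and the function names (`mean_shift_directions`, `fit_mean_penalized`), the penalty weight `lam`, and the toy data are all hypothetical. Shift directions are estimated from unlabeled covariates only, and the fit penalizes the model's response along them.

```python
import numpy as np

def mean_shift_directions(unlabeled_envs):
    """One row per environment: its covariate mean minus the average of the environment means."""
    means = np.stack([X.mean(axis=0) for X in unlabeled_envs])
    return means - means.mean(axis=0)

def fit_mean_penalized(X, y, directions, lam):
    """Closed-form minimizer of mean((y - X w)^2) + lam * mean_e (d_e . w)^2."""
    n = X.shape[0]
    gram = X.T @ X / n
    penalty = directions.T @ directions / directions.shape[0]
    return np.linalg.solve(gram + lam * penalty, X.T @ y / n)

# Toy anti-causal data (an assumption for illustration): y generates both
# covariates, and the environment shifts only the second covariate.
rng = np.random.default_rng(0)
shifts = [np.array([0.0, -2.0]), np.array([0.0, 0.0]), np.array([0.0, 2.0])]

def sample(shift, n=500):
    y = rng.normal(size=n)
    X = np.column_stack([y, y]) + rng.normal(size=(n, 2)) + shift
    return X, y

X_lab, y_lab = sample(shifts[1])             # labels from one environment only
unlabeled = [sample(s)[0] for s in shifts]   # covariates only, no labels used
directions = mean_shift_directions(unlabeled)

w_ols = fit_mean_penalized(X_lab, y_lab, directions, lam=0.0)
w_pen = fit_mean_penalized(X_lab, y_lab, directions, lam=100.0)
```

In this sketch, increasing `lam` shrinks the weight on the shifted covariate, so predictions move less when the covariate mean changes across environments. A covariance-based variant would need a different penalty and is not shown here.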