Amazon Science · Cambridge · CausalFoundationModels. · ELLIS · Freiburg · Imperial · Max Planck · Oxford
Feb 16, 2026 · arXiv:2602.14972

Use What You Know: Causal Foundation Models with Partial Graphs

Arik Reuter, Anish Dhir, Cristiana Diaconu, Jake Robertson, Ole Ossen, Frank Hutter, Mark van der Wilk, Bernhard Schölkopf

AI Summary

This paper introduces a method for conditioning Causal Foundation Models (CFMs) on partial causal graph information, addressing the limitation that existing CFMs cannot incorporate domain knowledge. The authors propose injecting learnable biases into the attention mechanism of CFMs so that the model can use both full and partial causal information. Experiments show that this conditioning allows a general-purpose CFM to match the performance of specialized models trained on specific causal structures.

Key Contribution

General-purpose Causal Foundation Models can now match the performance of specialized causal models by incorporating partial causal graph information via attention bias, unlocking a more unified approach to causal inference.

Abstract

Estimating causal quantities traditionally relies on bespoke estimators tailored to specific assumptions. Recently proposed Causal Foundation Models (CFMs) promise a more unified approach by amortising causal discovery and inference in a single step. However, in their current state, they do not allow for the incorporation of any domain knowledge, which can lead to suboptimal predictions. We bridge this gap by introducing methods to condition CFMs on causal information, such as the causal graph or more readily available ancestral information. When access to complete causal graph information is too strict a requirement, our approach also effectively leverages partial causal information. We systematically evaluate conditioning strategies and find that injecting learnable biases into the attention mechanism is the most effective method to utilise full and partial causal information. Our experiments show that this conditioning allows a general-purpose CFM to match the performance of specialised models trained on specific causal structures. Overall, our approach addresses a central hurdle on the path towards all-in-one causal foundation models: the capability to answer causal queries in a data-driven manner while effectively leveraging any amount of domain expertise.
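To make the idea of "injecting learnable biases into the attention mechanism" more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a three-state encoding of pairwise knowledge (edge known present, edge known absent, unknown) and one scalar bias per state that is added to the attention logits between variables. The names `biased_attention`, `graph`, and `bias`, the encoding, and the specific bias values are all hypothetical and chosen only for illustration; in a trained model the bias scalars would be learned rather than fixed.

```python
import math

# Hypothetical encoding of what is known about a directed edge j -> i.
KNOWN_EDGE, KNOWN_NO_EDGE, UNKNOWN = 1, 0, -1


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def biased_attention(queries, keys, values, graph, bias):
    """Single-head attention across variables with an additive graph bias.

    graph[j][i] holds the knowledge state for the edge j -> i.
    bias maps each state to a scalar (assumed learnable in a real model).
    Returns (outputs, attention_weights).
    """
    scale = math.sqrt(len(queries[0]))
    outputs, weights = [], []
    for i, q in enumerate(queries):
        logits = []
        for j, k in enumerate(keys):
            score = sum(a * b for a, b in zip(q, k)) / scale
            # Shift the logit according to what is known about j -> i.
            logits.append(score + bias[graph[j][i]])
        w = softmax(logits)
        weights.append(w)
        outputs.append([sum(wj * v[c] for wj, v in zip(w, values))
                        for c in range(len(values[0]))])
    return outputs, weights


# Toy example with three variables and identical query/key embeddings,
# so any difference in attention comes from the bias alone.
embeddings = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
values = [[1.0], [2.0], [3.0]]

# Known: 0 -> 1 exists; 1 -> 0, 2 -> 0, 2 -> 1 are absent; the rest is unknown.
graph = [
    [UNKNOWN, KNOWN_EDGE, UNKNOWN],
    [KNOWN_NO_EDGE, UNKNOWN, UNKNOWN],
    [KNOWN_NO_EDGE, KNOWN_NO_EDGE, UNKNOWN],
]

# Illustrative fixed values; the large negative number acts as a hard mask.
bias = {KNOWN_EDGE: 1.0, KNOWN_NO_EDGE: -1e9, UNKNOWN: 0.0}

outputs, attn = biased_attention(embeddings, embeddings, values, graph, bias)
```

In this toy setup, variable 1 places more attention on variable 0 (a known parent) than on itself and effectively none on variable 2 (a known non-parent), while variable 2, about which nothing is known, attends uniformly. Pairs marked unknown fall back to ordinary data-driven attention, which is one way a single model could accept anything from no graph information to a complete graph.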


