This paper explores using large language models (LLMs) as imperfect experts within the Causal Assumption-based Argumentation (ABA) framework to improve causal discovery. The approach elicits semantic structural priors from variable names and descriptions using LLMs and integrates these priors with conditional-independence tests derived from observational data. Experiments on standard benchmarks and semantically grounded synthetic graphs demonstrate state-of-the-art performance. The paper also introduces an evaluation protocol to mitigate memorization bias when assessing LLMs for causal discovery.
LLMs, despite their imperfections, can be effective at causal discovery when their semantic priors are combined with conditional-independence evidence in Causal ABA, reaching state-of-the-art performance on the benchmarks reported.
Causal discovery seeks to uncover causal relations from data, typically represented as causal graphs, and is essential for predicting the effects of interventions. While expert knowledge is required to construct principled causal graphs, many statistical methods have been proposed to leverage observational data with varying formal guarantees. Causal Assumption-based Argumentation (ABA) is a framework that uses symbolic reasoning to ensure correspondence between input constraints and output graphs, while offering a principled way to combine data and expertise. We explore the use of large language models (LLMs) as imperfect experts for Causal ABA, eliciting semantic structural priors from variable names and descriptions and integrating them with conditional-independence evidence. Experiments on standard benchmarks and semantically grounded synthetic graphs demonstrate state-of-the-art performance, and we additionally introduce an evaluation protocol to mitigate memorisation bias when assessing LLMs for causal discovery.
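To make the idea of combining conditional-independence evidence with LLM-elicited priors concrete, here is a minimal sketch. It is not the Causal ABA implementation: it brute-forces all DAGs over three variables instead of using symbolic argumentation, and the variable names, the independence facts, and the "LLM prior" (a set of forbidden edges) are all hypothetical values hard-coded for illustration.

```python
from itertools import combinations, product

def ancestors(nodes, edges):
    """Return `nodes` together with all of their ancestors."""
    result, changed = set(nodes), True
    while changed:
        changed = False
        for parent, child in edges:
            if child in result and parent not in result:
                result.add(parent)
                changed = True
    return result

def d_separated(x, y, given, edges):
    """Check x _||_ y | given using the moralised ancestral graph."""
    keep = ancestors({x, y} | set(given), edges)
    sub = [(p, c) for p, c in edges if p in keep and c in keep]
    undirected = {frozenset(e) for e in sub}
    # Moralise: link every pair of parents that share a child.
    for child in keep:
        parents = [p for p, c in sub if c == child]
        undirected |= {frozenset(pair) for pair in combinations(parents, 2)}
    # Search from x without passing through the conditioning set.
    frontier, seen = [x], {x}
    while frontier:
        node = frontier.pop()
        for edge in undirected:
            if node in edge:
                (other,) = edge - {node}
                if other not in seen and other not in given:
                    seen.add(other)
                    frontier.append(other)
    return y not in seen

def is_acyclic(variables, edges):
    """Repeatedly peel off nodes with no incoming edge."""
    remaining, live = set(variables), set(edges)
    while remaining:
        roots = {v for v in remaining if all(c != v for _, c in live)}
        if not roots:
            return False
        remaining -= roots
        live = {(p, c) for p, c in live if p in remaining and c in remaining}
    return True

def all_dags(variables):
    """Enumerate every DAG over `variables` (feasible only for tiny sets)."""
    pairs = list(combinations(variables, 2))
    for choice in product((None, 0, 1), repeat=len(pairs)):
        edges = {(a, b) if c == 0 else (b, a)
                 for (a, b), c in zip(pairs, choice) if c is not None}
        if is_acyclic(variables, edges):
            yield frozenset(edges)

def consistent_dags(variables, independent, dependent, forbidden=()):
    """Keep DAGs that match the CI facts and contain no forbidden edge."""
    return [
        g for g in all_dags(variables)
        if all(d_separated(x, y, z, g) for x, y, z in independent)
        and not any(d_separated(x, y, z, g) for x, y, z in dependent)
        and not (set(forbidden) & g)
    ]

# Hypothetical inputs: CI facts as they might come from statistical tests.
variables = ["smoking", "tar", "cancer"]
independent = [("smoking", "cancer", frozenset({"tar"}))]
dependent = [
    ("smoking", "tar", frozenset()),
    ("tar", "cancer", frozenset()),
    ("smoking", "cancer", frozenset()),
]
# Hypothetical LLM prior: "tar deposits do not cause smoking".
llm_forbidden = {("tar", "smoking")}

data_only = consistent_dags(variables, independent, dependent)
with_prior = consistent_dags(variables, independent, dependent, llm_forbidden)
```

In this toy setting the independence facts alone leave three Markov-equivalent graphs in `data_only`, while the single forbidden edge narrows `with_prior` to the chain smoking → tar → cancer. An actual system would also have to handle priors that conflict with the data, which is the role the abstract assigns to Causal ABA's symbolic reasoning.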