This paper analyzes the bias introduced by discretizing continuous variables when estimating causal effect functionals that involve integration over conditional densities. It shows that, under smoothness conditions, this coarsening bias is first order in the bin width and arises at the population level, separately from statistical estimation error. The authors propose a bias-reduced functional that evaluates the outcome regression at within-bin conditional means, which removes the leading bias term and leaves a second-order approximation error, and they derive plug-in and one-step estimators for it.
Discretizing continuous variables in causal inference introduces a bias that is first order in the bin width, but a simple adjustment that evaluates the outcome regression at within-bin conditional means reduces it to second order.
A class of causal effect functionals requires integration over conditional densities of continuous variables, as in mediation effects and nonparametric identification in causal graphical models. Estimating such densities and evaluating the resulting integrals can be statistically and computationally demanding. A common workaround is to discretize the variable and replace integrals with finite sums. Although convenient, discretization alters the population-level functional and can induce non-negligible approximation bias, even under correct identification. Under smoothness conditions, we show that this coarsening bias is first order in the bin width and arises at the level of the target functional, distinct from statistical estimation error. We propose a simple bias-reduced functional that evaluates the outcome regression at within-bin conditional means, eliminating the leading term and yielding a second-order approximation error. We derive plug-in and one-step estimators for the bias-reduced functional. Simulations demonstrate substantial bias reduction and near-nominal confidence interval coverage, even under coarse binning. Our results provide a simple framework for controlling the impact of variable discretization on parameter approximation and estimation.
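The order-of-bias claim can be illustrated with a one-variable toy calculation. The sketch below is not the paper's functional or its estimators: the regression `mu`, the density `density`, and the choice of the left bin edge as the naive representative value are all assumptions made for illustration. It compares a coarsened sum that evaluates the regression at a fixed point in each bin with one that evaluates it at the within-bin conditional mean, against the exact integral. A first-order Taylor expansion suggests why the second version does better: the leading error term is proportional to the gap between the evaluation point and the within-bin conditional mean, and that gap is zero by construction.

```python
import numpy as np

# Gauss-Legendre nodes and weights on [-1, 1]; 20 points is ample here.
NODES, WEIGHTS = np.polynomial.legendre.leggauss(20)


def integrate(f, a, b):
    """Integrate a vectorized function f over [a, b] by Gauss-Legendre quadrature."""
    x = 0.5 * (b - a) * NODES + 0.5 * (a + b)
    return 0.5 * (b - a) * np.sum(WEIGHTS * f(x))


def mu(m):
    """Hypothetical smooth outcome regression E[Y | M = m] (an assumption)."""
    return np.sin(2.0 * m) + m ** 2


def density(m):
    """Hypothetical density of M on [0, 1] (an assumption); integrates to 1."""
    return (1.0 + m) / 1.5


# Population target: the integral of mu(m) against the density of M.
target = integrate(lambda m: mu(m) * density(m), 0.0, 1.0)


def coarsened_values(n_bins):
    """Return (naive, reduced) coarsened approximations using n_bins equal bins.

    naive:   mu evaluated at the left bin edge (assumed representative point).
    reduced: mu evaluated at the within-bin conditional mean of M.
    """
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    naive = 0.0
    reduced = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        prob = integrate(density, a, b)  # P(M in bin)
        cond_mean = integrate(lambda m: m * density(m), a, b) / prob
        naive += mu(a) * prob
        reduced += mu(cond_mean) * prob
    return naive, reduced


if __name__ == "__main__":
    for n_bins in (5, 10, 20, 40):
        naive, reduced = coarsened_values(n_bins)
        print(n_bins, abs(naive - target), abs(reduced - target))
```

In this toy setting, halving the bin width should roughly halve the error of the fixed-point version and roughly quarter the error of the conditional-mean version, which is the first-order versus second-order behavior described in the abstract. All quantities here are population-level, so no sampling error is involved.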