Feb 23, 2026 | arXiv:2602.19782

Addressing Instrument-Outcome Confounding in Mendelian Randomization through Representation Learning

AI Summary

The paper addresses instrument-outcome confounding in Mendelian Randomization (MR) by learning latent, exogenous components of genetic instruments using representation learning. This tackles violations of the core MR assumption of independence between instruments and unobserved confounders, which can arise from population stratification or assortative mating. The approach leverages cross-environment invariance in multi-environment data to identify these latent instruments, and its effectiveness is demonstrated through simulations and semi-synthetic experiments using the All of Us Research Hub data.

Key Contribution

Multi-environment data can rescue Mendelian Randomization from confounding by learning latent, exogenous genetic instruments.

Abstract

Mendelian Randomization (MR) is a prominent observational epidemiological research method designed to address unobserved confounding when estimating causal effects. However, core assumptions -- particularly the independence between instruments and unobserved confounders -- are often violated due to population stratification or assortative mating. Leveraging the increasing availability of multi-environment data, we propose a representation learning framework that exploits cross-environment invariance to recover latent exogenous components of genetic instruments. We provide theoretical guarantees for identifying these latent instruments under various mixing mechanisms and demonstrate the effectiveness of our approach through simulations and semi-synthetic experiments using data from the All of Us Research Hub.
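To make the violated assumption concrete, here is a minimal simulation sketch. It is not the paper's method. It only illustrates, under assumed linear structural equations, how a standard single-instrument Wald ratio estimate drifts away from the true causal effect once the instrument is correlated with the unobserved confounder. The function names (`wald_ratio`, `simulate`), the `leak` parameter, and all coefficients are hypothetical choices made for this illustration.

```python
import numpy as np


def wald_ratio(z, x, y):
    """Single-instrument IV (Wald ratio) estimate: cov(z, y) / cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


def simulate(n, leak, beta=0.5, seed=0):
    """Draw (instrument, exposure, outcome) from an assumed linear model.

    `leak` controls how strongly the unobserved confounder u enters the
    instrument z; leak = 0 satisfies instrument-confounder independence.
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                 # unobserved confounder
    z = rng.normal(size=n) + leak * u      # instrument (hypothetical genetic score)
    x = z + u + rng.normal(size=n)         # exposure depends on z and u
    y = beta * x + u + rng.normal(size=n)  # outcome; true causal effect is beta
    return z, x, y


# Valid instrument: the estimate should land near the true effect beta = 0.5.
valid = wald_ratio(*simulate(200_000, leak=0.0))

# Confounded instrument: the estimate targets beta + cov(z, u) / cov(z, x).
confounded = wald_ratio(*simulate(200_000, leak=0.5))

print(f"valid instrument:      {valid:.3f}")
print(f"confounded instrument: {confounded:.3f}")
```

In this toy model the Wald ratio converges to beta + cov(z, u) / cov(z, x), so with `leak=0.5` the target is 0.5 + 0.5 / 1.75, roughly 0.79, rather than 0.5. Collecting more samples does not remove that gap, which is why the abstract looks for an exogenous component of the instrument instead.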

Natural Language Processing | Scientific Discovery & Drug Design

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
