Feb 19, 2026 · arXiv:2602.17532

Systematic Evaluation of Single-Cell Foundation Model Interpretability Reveals Attention Captures Co-Expression Rather Than Unique Regulatory Signal

AI Summary

This paper introduces a systematic evaluation framework for mechanistic interpretability in single-cell foundation models, comprising 37 analyses and 153 statistical tests across multiple cell types and perturbation modalities. Applying this framework to scGPT and Geneformer, the authors find that while attention patterns encode structured biological information like protein-protein interactions and transcriptional regulation, they do not improve perturbation prediction compared to simple gene-level baselines. They also introduce Cell-State Stratified Interpretability (CSSI) to address attention scaling failures, improving GRN recovery.

Key Contribution

Attention mechanisms in single-cell foundation models capture co-expression patterns rather than unique regulatory signals, challenging assumptions about their mechanistic interpretability.

Abstract

We present a systematic evaluation framework (37 analyses, 153 statistical tests, four cell types, two perturbation modalities) for assessing mechanistic interpretability in single-cell foundation models. Applying this framework to scGPT and Geneformer, we find that attention patterns encode structured biological information with layer-specific organisation (protein-protein interactions in early layers, transcriptional regulation in late layers), but this structure provides no incremental value for perturbation prediction: trivial gene-level baselines outperform both attention and correlation edges (AUROC 0.81-0.88 versus 0.70), pairwise edge scores add zero predictive contribution, and causal ablation of regulatory heads produces no degradation. These findings generalise from K562 to RPE1 cells; the attention-correlation relationship is context-dependent, but gene-level dominance is universal. Cell-State Stratified Interpretability (CSSI) addresses an attention-specific scaling failure, improving gene regulatory network (GRN) recovery by up to 1.85x. The framework establishes reusable quality-control standards for the field.
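
The abstract compares predictors by AUROC. As a rough illustration of what that metric measures, the sketch below computes AUROC from the Mann-Whitney rank statistic using only the Python standard library. It is not the authors' evaluation code: the function name `auroc`, the toy scores, and the toy labels are all hypothetical and exist only to show the calculation.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC via the Mann-Whitney U statistic; tied scores share an average rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j to cover the run of items tied with position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


if __name__ == "__main__":
    # Hypothetical toy data: 1 = gene responds to a perturbation, 0 = it does not.
    toy_labels = [0, 0, 1, 1]
    toy_edge_scores = [0.1, 0.4, 0.35, 0.8]      # stand-in for a pairwise edge score
    toy_gene_baseline = [0.1, 0.2, 0.8, 0.9]     # stand-in for a gene-level baseline
    print("edge score AUROC:", auroc(toy_edge_scores, toy_labels))
    print("gene baseline AUROC:", auroc(toy_gene_baseline, toy_labels))
```

An AUROC of 0.5 corresponds to a ranking no better than chance and 1.0 to perfect separation of positives from negatives, which is the scale on which the reported 0.81-0.88 versus 0.70 comparison should be read.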

Eval Frameworks & Benchmarks · Interpretability & Mechanistic Interp · Scientific Discovery & Drug Design

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
