The paper introduces DIMAFx, a novel explainable multimodal framework for cancer survival prediction that learns disentangled representations from histopathology and transcriptomics data. DIMAFx achieves state-of-the-art survival prediction performance while improving representation disentanglement across multiple cancer cohorts. By combining disentangled representations with SHAP explanations, the framework identifies key multimodal interactions, revealing that modality-shared features related to tumor morphology and estrogen response are highly predictive in breast cancer, aligning with known biology.
DIMAFx opens the black box of multimodal cancer survival prediction, showing that features shared between histopathology and transcriptomics, such as tumor morphology contextualized by estrogen response, are highly predictive.
While multimodal survival prediction models are increasingly accurate, their complexity often reduces interpretability, limiting insight into how different data sources influence predictions. To address this, we introduce DIMAFx, an explainable multimodal framework for cancer survival prediction that produces disentangled, interpretable modality-specific and modality-shared representations from histopathology whole-slide images and transcriptomics data. Across multiple cancer cohorts, DIMAFx achieves state-of-the-art performance and improved representation disentanglement. Leveraging its interpretable design and SHapley Additive exPlanations, DIMAFx systematically reveals key multimodal interactions and the biological information encoded in the disentangled representations. In breast cancer survival prediction, the most predictive features contain modality-shared information, including one capturing solid tumor morphology contextualized primarily by late estrogen response, where higher-grade morphology aligned with pathway upregulation and increased risk, consistent with known breast cancer biology. Key modality-specific features capture microenvironmental signals from interacting adipose and stromal morphologies. These results show that multimodal models can overcome the traditional trade-off between performance and explainability, supporting their application in precision medicine.
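To make the attribution step concrete, the following is a minimal sketch of how Shapley values assign a risk prediction to individual representation features. It is not the DIMAFx implementation and does not use the SHAP library: the risk function `toy_risk`, its coefficients, and the three feature slots (one modality-shared, one histopathology-specific, one transcriptomics-specific) are hypothetical, chosen only so that the exact computation stays small enough to follow.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, x, baseline):
    """Exact Shapley values of value_fn at x, relative to a baseline input.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (value_fn(with_i) - value_fn(without_i))
    return phi


def toy_risk(z):
    # Hypothetical risk score, NOT the model from the paper:
    # z[0] = modality-shared feature
    # z[1] = histopathology-specific feature
    # z[2] = transcriptomics-specific feature
    # The last term is an interaction between z[0] and z[2].
    return 0.8 * z[0] + 0.3 * z[1] + 0.1 * z[2] + 0.5 * z[0] * z[2]


features = [1.0, 2.0, -1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_risk, features, baseline)

for name, value in zip(["shared", "wsi_specific", "rna_specific"], phi):
    print(f"{name}: {value:+.3f}")
```

In this toy setting the attributions sum to the difference between the prediction and the baseline prediction, and the interaction term is split equally between the two features involved. Because the representations are disentangled, each attribution can be read as the contribution of a shared or a modality-specific signal. Practical pipelines would typically rely on an approximate explainer, since exact enumeration is infeasible beyond a few features.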