Mar 2, 2026 | arXiv:2603.01547

PathMoE: Interpretable Multimodal Interaction Experts for Pediatric Brain Tumor Classification

Jian Yu, Joakim Nguyen, Jinrui Fang, Awais Naeem, Zeyuan Cao, Sanjay Krishnan, Nicholas Konz, Tianlong Chen, Chandra Krishnan, Hairong Wang, Edward Castillo, Ying Ding, Ankita Shukla

AI Summary

The paper introduces PathMoE, a multimodal framework for pediatric brain tumor classification that integrates whole-slide images (WSI), pathology reports, and nuclei-level cell graphs using an interaction-aware mixture-of-experts architecture. PathMoE trains specialized experts to capture modality-specific information and interactions, using an input-dependent gating mechanism for dynamic weighting and interpretability. Experiments on internal and external datasets demonstrate that PathMoE significantly improves macro-F1 scores compared to image-only baselines, highlighting the importance of multimodal integration for accurate and interpretable predictions.
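Macro-F1, the metric reported here, is the unweighted mean of per-class F1 scores, so rare tumor subtypes count as much as common ones. The following is a minimal sketch of that computation in plain Python; the function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the paper or its code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical labels, for illustration only (not data from the paper).
y_true = ["a", "a", "a", "b"]
y_pred = ["a", "a", "b", "b"]
score = macro_f1(y_true, y_pred)
```

Because each class contributes equally, a model that ignores a rare class is penalized even if its overall accuracy is high.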

Key Contribution

PathMoE reveals the specific modality interactions driving individual predictions in pediatric brain tumor classification, offering crucial interpretability for rare tumor subtypes.

Abstract

Accurate classification of pediatric central nervous system tumors remains challenging due to histological complexity and limited training data. While pathology foundation models have advanced whole-slide image (WSI) analysis, they often fail to leverage the rich, complementary information found in clinical text and tissue microarchitecture. To this end, we propose PathMoE, an interpretable multimodal framework that integrates H&E slides, pathology reports, and nuclei-level cell graphs via an interaction-aware mixture-of-experts architecture built on state-of-the-art foundation models for each modality. By training specialized experts to capture modality uniqueness, redundancy, and synergy, PathMoE employs an input-dependent gating mechanism that dynamically weights these interactions, providing sample-level interpretability. We evaluate our framework on two dataset-specific classification tasks on an internal pediatric brain tumor dataset (PBT) and external TCGA datasets. PathMoE improves macro-F1 from 0.762 to 0.799 (+0.037) on PBT when integrating WSI, text, and graph modalities; on TCGA, augmenting WSI with graph knowledge improves macro-F1 from 0.668 to 0.709 (+0.041). These results demonstrate significant performance gains over state-of-the-art image-only baselines while revealing the specific modality interactions driving individual predictions. This interpretability is particularly critical for rare tumor subtypes, where transparent model reasoning is essential for clinical trust and diagnostic validation.
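The input-dependent gating described above can be illustrated with a generic mixture-of-experts sketch: a gate produces one score per interaction expert, a softmax turns the scores into weights, and the final prediction is the weighted sum of the experts' class probabilities. This is an assumption-laden illustration, not the authors' implementation: the expert names, the number of experts, the gate scores, and the helper `gate_and_mix` are all hypothetical, and the actual PathMoE gate and experts are neural networks whose details are not given in the abstract.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gate_and_mix(gate_scores, expert_probs):
    """Weight each expert's class probabilities by its softmax gate weight.

    gate_scores:  one raw score per expert (in practice, output of a gating
                  network applied to the input sample).
    expert_probs: one class-probability list per expert.
    Returns (gate_weights, mixed_class_probs).
    """
    weights = softmax(gate_scores)
    n_classes = len(expert_probs[0])
    mixed = [
        sum(w * probs[c] for w, probs in zip(weights, expert_probs))
        for c in range(n_classes)
    ]
    return weights, mixed


# Hypothetical experts and values, for illustration only.
expert_names = ["uniqueness", "redundancy", "synergy"]
gate_scores = [0.2, 1.5, -0.3]
expert_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
]
weights, mixed = gate_and_mix(gate_scores, expert_probs)
# The per-sample gate weights are what make the prediction inspectable.
dominant_expert = expert_names[weights.index(max(weights))]
```

Reading off `weights` for a single sample indicates which interaction type contributed most to that prediction, which is the kind of sample-level interpretability the abstract refers to.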

Architecture Design (Transformers, SSMs, MoE) | Interpretability & Mechanistic Interp | Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
