The paper introduces the Multi-scale Adaptive Recurrent Biomedical Linear-time Encoder (MARBLE), a purely Mamba-based multiple instance learning (MIL) framework for whole-slide image (WSI) analysis. MARBLE processes multiple magnification levels of a WSI in parallel and uses a linear-time state-space model to capture cross-scale dependencies, addressing both the single-scale limitation of most MIL methods and the quadratic attention cost of transformer-based approaches. Experiments on five public datasets show that MARBLE delivers consistent gains in AUC, accuracy, and C-index over existing methods, establishing its efficacy for multi-scale WSI analysis.
Ditch the quadratic attention bottleneck: MARBLE uses Mamba to achieve state-of-the-art results on whole-slide image analysis with linear complexity.
WSI analysis remains challenging due to gigapixel resolutions and hierarchical magnifications: existing MIL methods typically operate at a single scale, while transformer-based approaches suffer from quadratic attention costs. We introduce the Multi-scale Adaptive Recurrent Biomedical Linear-time Encoder (MARBLE), the first \textit{purely Mamba-based} multi-scale multiple instance learning (MIL) framework for whole-slide image (WSI) analysis. MARBLE processes multiple magnification levels in parallel and integrates coarse-to-fine reasoning within a linear-time state-space model, efficiently capturing cross-scale dependencies with minimal parameter overhead. By coupling parallel multi-scale processing with linear-time sequence modeling, MARBLE provides a scalable and modular alternative to attention-based architectures. Experiments on five public datasets show improvements of up to \textbf{6.9\%} in AUC, \textbf{20.3\%} in accuracy, and \textbf{2.3\%} in C-index, establishing MARBLE as an efficient and generalizable framework for multi-scale WSI analysis.
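To make the linear-time idea concrete, the following is a minimal, illustrative sketch (not the authors' implementation, and with entirely hypothetical function names, feature dimensions, and decay values): each magnification level's patch sequence is aggregated by a toy diagonal state-space recurrence in O(n) time, and the per-scale states are then fused into one bag-level representation.

```python
# Hypothetical sketch of linear-time multi-scale MIL aggregation.
# A real Mamba block uses learned, input-dependent state-space
# parameters; here a fixed scalar decay stands in for them.

def ssm_scan(patches, decay=0.9):
    """Aggregate patch feature vectors with h_t = decay * h_{t-1} + x_t.

    One pass over the sequence, so cost is linear in the number of
    patches, unlike pairwise attention, which is quadratic.
    Returns the final hidden state as the scale-level representation.
    """
    h = [0.0] * len(patches[0])
    for x in patches:
        h = [decay * hi + xi for hi, xi in zip(h, x)]
    return h

def multiscale_bag(scales, decay=0.9):
    """Scan each magnification level independently (parallelizable),
    then fuse the per-scale states by simple averaging."""
    states = [ssm_scan(patches, decay) for patches in scales]
    dim = len(states[0])
    return [sum(s[d] for s in states) / len(states) for d in range(dim)]

# Two magnification levels with different patch counts; each patch
# is a 3-d feature vector (toy values).
low_mag = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
high_mag = [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.5, 0.5, 0.5]]
bag = multiscale_bag([low_mag, high_mag])  # one fused 3-d bag vector
```

The fusion here is a plain average purely for illustration; the paper's cross-scale coupling inside the state-space model is what the averaging step stands in for.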