CAFlow is an adaptive-depth, single-step flow-matching framework that addresses the computational cost of super-resolution for gigapixel histopathology images. By performing flow matching in pixel-unshuffled space and employing a lightweight exit classifier, CAFlow adaptively routes each image tile to the shallowest network exit that maintains reconstruction quality, yielding significant compute savings. Results show that CAFlow matches or exceeds existing methods at substantially reduced computational cost, while preserving clinically relevant structures for downstream tasks such as nuclei segmentation.
Dramatically speed up histopathology super-resolution by adaptively routing image tiles through a flow-matching network, achieving near-lossless quality at a fraction of the compute.
In digital pathology, whole-slide images routinely exceed gigapixel resolution, making computationally intensive generative super-resolution (SR) impractical for routine deployment. We introduce CAFlow, an adaptive-depth, single-step flow-matching framework that routes each image tile to the shallowest network exit that preserves reconstruction quality. CAFlow performs flow matching in pixel-unshuffled space, reducing spatial computation by 16x while enabling direct inference. We show that dedicating half of training to exact t=0 samples is essential for single-step quality (-1.5 dB without it). The backbone, FlowResNet (1.90M parameters), mixes convolution and window self-attention blocks across four early exits spanning 3.1 to 13.3 GFLOPs. A lightweight exit classifier (~6K parameters) yields 33% compute savings at a cost of only 0.12 dB. On multi-organ histopathology x4 SR, adaptive routing achieves 31.72 dB PSNR versus 31.84 dB at full depth, while the shallowest exit exceeds bicubic by +1.9 dB at 2.8x less compute than SwinIR-light. The method generalizes to held-out colon tissue with minimal quality loss (-0.02 dB), and at x8 upscaling it outperforms all comparable-compute baselines while remaining competitive with the much larger SwinIR-Medium model. Downstream nuclei segmentation confirms that clinically relevant structure is preserved. The model trains in under 5 hours on a single GPU, and adaptive routing can reduce whole-slide inference from minutes to seconds.
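The pixel-unshuffle rearrangement is a standard space-to-depth transform; a minimal NumPy sketch (the function name and tile shapes here are illustrative, not taken from the paper) shows how a x4 factor trades a 16x spatial reduction for 16x more channels:

```python
import numpy as np

def pixel_unshuffle(x, r):
    """Space-to-depth: rearrange (C, H, W) into (C*r*r, H/r, W/r).
    Each r-by-r spatial patch is folded into the channel dimension,
    so for r=4 the number of spatial positions drops by 16x."""
    c, h, w = x.shape
    assert h % r == 0 and w % r == 0, "H and W must be divisible by r"
    x = x.reshape(c, h // r, r, w // r, r)
    x = x.transpose(0, 2, 4, 1, 3)          # (C, r, r, H/r, W/r)
    return x.reshape(c * r * r, h // r, w // r)

# Toy RGB tile: 3x64x64 becomes 48x16x16 (same total element count).
tile = np.random.rand(3, 64, 64)
packed = pixel_unshuffle(tile, 4)
print(packed.shape)  # (48, 16, 16)
```

The transform is lossless and invertible (pixel shuffle undoes it), which is what allows flow matching to run at the reduced resolution and still reconstruct the full-resolution output.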
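The "half of training at exact t=0" finding might be implemented as a biased timestep sampler in an otherwise standard flow-matching loop; the sketch below is an assumption on our part (the `p_zero` parameter and function name are hypothetical, inferred from the reported 50% figure):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_timesteps(batch, p_zero=0.5):
    """Draw flow-matching timesteps for one batch: with probability
    p_zero a sample is pinned to exactly t=0 (the regime used at
    single-step inference); otherwise t is uniform on [0, 1)."""
    t = rng.uniform(0.0, 1.0, size=batch)
    zero_mask = rng.random(batch) < p_zero
    t[zero_mask] = 0.0
    return t
```

The intuition is that a single-step sampler only ever evaluates the network at t=0, so heavily oversampling that point during training aligns the training distribution with the inference condition.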
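The adaptive routing can be sketched as a sequential early-exit loop over the backbone; the block, head, and classifier interfaces below are hypothetical stand-ins, since the abstract does not specify the API:

```python
import numpy as np

def route_tile(feat, blocks, heads, exit_score, threshold=0.5):
    """Run backbone blocks in order. After each early exit, a lightweight
    classifier score decides whether to stop, so each tile takes the
    shallowest exit predicted to preserve reconstruction quality."""
    for i, block in enumerate(blocks):
        feat = block(feat)
        last = i == len(blocks) - 1
        if last or exit_score(feat, i) >= threshold:
            return heads[i](feat), i  # reconstruction and exit depth

# Toy demo: four "blocks" that each add detail; the scorer becomes
# confident at the second exit, so the tile stops at depth 1.
blocks = [lambda x: x + 1.0 for _ in range(4)]
heads = [lambda x: x for _ in range(4)]
scores = [0.2, 0.9, 0.9, 0.9]
out, depth = route_tile(np.zeros((3, 16, 16)), blocks, heads,
                        lambda f, i: scores[i])
print(depth)  # 1
```

Easy tiles exit after the cheapest blocks while hard tiles pay for full depth, which is where the reported 33% average compute savings would come from.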