The authors introduce MedVAR, an autoregressive foundation model for medical image generation that uses next-scale prediction for efficient and scalable synthesis. MedVAR generates images in a coarse-to-fine manner, producing multi-scale representations useful for downstream tasks. They curate a harmonized dataset of around 440,000 CT and MRI images across six anatomical regions and report state-of-the-art generative performance in experiments covering fidelity, diversity, and scalability.
MedVAR reports state-of-the-art medical image generation using an autoregressive next-scale prediction architecture designed for both speed and scalability.
Medical image generation is pivotal in applications like data augmentation for low-resource clinical tasks and privacy-preserving data sharing. However, developing a scalable generative backbone for medical imaging requires architectural efficiency, sufficient multi-organ data, and principled evaluation, yet current approaches leave these aspects unresolved. Therefore, we introduce MedVAR, the first autoregressive-based foundation model that adopts the next-scale prediction paradigm to enable fast and scale-up-friendly medical image synthesis. MedVAR generates images in a coarse-to-fine manner and produces structured multi-scale representations suitable for downstream use. To support hierarchical generation, we curate a harmonized dataset of around 440,000 CT and MRI images spanning six anatomical regions. Comprehensive experiments across fidelity, diversity, and scalability show that MedVAR achieves state-of-the-art generative performance and offers a promising architectural direction for future medical generative foundation models.
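To make the coarse-to-fine idea concrete, here is a minimal sketch of a next-scale generation loop. It is an illustration under stated assumptions, not MedVAR's implementation: the scale schedule, the residual accumulation scheme, and the names `resize_nearest`, `stub_predictor`, and `generate_coarse_to_fine` are all hypothetical, and the predictor is a random stand-in for a trained model.

```python
import numpy as np

def resize_nearest(x, size):
    """Nearest-neighbour resize of a square 2-D map to (size, size)."""
    idx = np.arange(size) * x.shape[0] // size
    return x[np.ix_(idx, idx)]

def stub_predictor(context, size, rng):
    """Hypothetical stand-in for a trained model.

    A real next-scale model would predict every token of the current
    scale in one step, conditioned on all coarser scales. Here we only
    mix a little of the context with scale-dependent noise.
    """
    return 0.1 * context + rng.normal(size=(size, size)) / size

def generate_coarse_to_fine(scales=(1, 2, 4, 8, 16), seed=0):
    """Generate one map per scale and accumulate them into a final canvas."""
    rng = np.random.default_rng(seed)
    final = scales[-1]
    canvas = np.zeros((final, final))  # running full-resolution estimate
    per_scale = []                     # the multi-scale representations
    for s in scales:
        # Condition on what has been generated so far, viewed at scale s.
        context = resize_nearest(canvas, s)
        residual = stub_predictor(context, s, rng)
        per_scale.append(residual)
        # Add this scale's contribution at full resolution.
        canvas = canvas + resize_nearest(residual, final)
    return canvas, per_scale

canvas, per_scale = generate_coarse_to_fine()
print(canvas.shape, [m.shape for m in per_scale])
```

The loop takes one step per scale rather than one step per pixel or token, which is where the speed advantage of next-scale prediction over token-by-token autoregression comes from. The `per_scale` list corresponds loosely to the structured multi-scale representations the abstract mentions.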