The paper introduces MAESIL, a novel masked autoencoder framework for self-supervised learning on 3D medical images that addresses the limitations of existing methods that treat 3D volumes as independent 2D slices. MAESIL uses a "superpatch" approach, partitioning the volume into 3D chunks and employing a dual-masking strategy within a 3D masked autoencoder to better capture spatial relationships. Experiments on three CT datasets demonstrate that MAESIL significantly improves reconstruction metrics like PSNR and SSIM compared to AE, VAE, and VQ-VAE, establishing it as a strong pre-training method.
Medical imaging AI can now leverage a self-supervised pre-training method that understands 3D context, boosting reconstruction quality beyond what's possible with 2D-centric approaches.
Training deep learning models for three-dimensional (3D) medical imaging, such as Computed Tomography (CT), is fundamentally challenged by the scarcity of labeled data. Pre-training on natural images is common, but it introduces a significant domain shift that limits performance. Self-Supervised Learning (SSL) on unlabeled medical data has emerged as a powerful solution, yet prominent frameworks often fail to exploit the inherent 3D nature of CT scans. These methods typically process a 3D scan as a collection of independent 2D slices, discarding critical axial coherence and 3D structural context. To address this limitation, we propose the masked autoencoder for enhanced self-supervised medical image learning (MAESIL), a novel self-supervised learning framework designed to capture 3D structural information efficiently. The core innovation is the ‘superpatch,’ a 3D chunk-based input unit that balances 3D context preservation with computational efficiency. Our framework partitions the volume into superpatches and employs a 3D masked autoencoder with a dual-masking strategy to learn comprehensive spatial representations. We validated our approach on three diverse, large-scale public CT datasets. Experimental results show that MAESIL significantly improves on existing methods such as AE, VAE, and VQ-VAE in key reconstruction metrics, including PSNR and SSIM. This establishes MAESIL as a robust and practical pre-training solution for 3D medical imaging tasks.
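The abstract does not give implementation details, so the following is only a minimal NumPy sketch of the general idea: splitting a CT volume into 3D chunks, hiding a random subset of smaller patches inside one chunk, and scoring a reconstruction with PSNR. The function names, the chunk and patch sizes, and the 75% mask ratio are all illustrative assumptions, and a single random mask stands in for the paper's dual-masking strategy, which is not specified here.

```python
import numpy as np


def partition_into_chunks(volume, size):
    """Split a (D, H, W) array into non-overlapping 3D chunks of shape `size`.

    Assumption: edges that do not fill a whole chunk are cropped;
    padding would be an equally reasonable choice.
    """
    d, h, w = volume.shape
    sd, sh, sw = size
    d, h, w = d - d % sd, h - h % sh, w - w % sw
    v = volume[:d, :h, :w]
    # Expose the chunk grid, then move grid axes in front of the in-chunk axes.
    v = v.reshape(d // sd, sd, h // sh, sh, w // sw, sw)
    v = v.transpose(0, 2, 4, 1, 3, 5)
    return v.reshape(-1, sd, sh, sw)


def random_mask(num_items, mask_ratio, rng):
    """Boolean mask with round(num_items * mask_ratio) entries set to True (hidden)."""
    num_masked = int(round(num_items * mask_ratio))
    mask = np.zeros(num_items, dtype=bool)
    mask[rng.choice(num_items, size=num_masked, replace=False)] = True
    return mask


def psnr(reference, reconstruction, data_range=1.0):
    """Peak signal-to-noise ratio in dB for intensities spanning `data_range`."""
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)


rng = np.random.default_rng(0)
# Stand-in for a normalized CT volume (depth, height, width); values in [0, 1).
volume = rng.random((64, 128, 128), dtype=np.float32)

# Hypothetical superpatch size: 16 slices of 64 x 64 pixels.
superpatches = partition_into_chunks(volume, (16, 64, 64))

# Hypothetical patch size inside one superpatch, with 75% of patches hidden.
patches = partition_into_chunks(superpatches[0], (4, 16, 16))
mask = random_mask(len(patches), 0.75, rng)
visible_patches = patches[~mask]  # what an encoder would see in an MAE-style setup
```

In a masked autoencoder, only the visible patches are encoded and the decoder is trained to reconstruct the hidden ones, so a chunk that spans several slices lets the model use context along the axial direction as well as within each slice.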