The paper introduces TriFusionSR, a wavelet-guided conditional diffusion framework that jointly fuses tri-modal medical images (MRI, CT, PET/SPECT) and super-resolves them. It uses the 2D Discrete Wavelet Transform to decompose multimodal features into frequency bands, enabling frequency-aware cross-modal interaction. A Rectified Wavelet Features (RWF) strategy calibrates the latent coefficients, and an Adaptive Spatial-Frequency Fusion (ASFF) module with gated channel-spatial attention performs structure-driven multimodal refinement. Experiments report state-of-the-art performance, with 4.8-12.4% PSNR improvement and reduced RMSE and LPIPS across multiple upsampling scales.
Achieve state-of-the-art medical image fusion and super-resolution by jointly processing tri-modal inputs with a wavelet-guided diffusion model that explicitly handles frequency imbalances.
Multimodal medical image fusion facilitates comprehensive diagnosis by aggregating complementary structural and functional information, but its effectiveness is limited by resolution degradation and modality discrepancies. Existing approaches typically perform image fusion and super-resolution (SR) in separate stages, leading to artifacts and degraded perceptual quality. These limitations are further amplified in tri-modal settings that combine anatomical modalities (e.g., MRI, CT) with functional scans (e.g., PET, SPECT), owing to pronounced frequency-domain imbalances. We propose TriFusionSR, a wavelet-guided conditional diffusion framework for joint tri-modal fusion and SR. The framework explicitly decomposes multimodal features into frequency bands using the 2D Discrete Wavelet Transform, enabling frequency-aware cross-modal interaction. We further introduce a Rectified Wavelet Features (RWF) strategy for latent coefficient calibration, followed by an Adaptive Spatial-Frequency Fusion (ASFF) module with gated channel-spatial attention to enable structure-driven multimodal refinement. Extensive experiments demonstrate state-of-the-art performance, achieving 4.8-12.4% PSNR improvement and substantial reductions in RMSE and LPIPS across multiple upsampling scales.
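To make the frequency-band decomposition concrete, the following is a minimal sketch of a single-level 2D Discrete Wavelet Transform, assuming the Haar wavelet. The abstract does not state which wavelet or how many decomposition levels the framework uses, so both are assumptions here. The function names (`haar_dwt2`, `haar_idwt2`, `band_energy_fractions`) are hypothetical, and the inputs are synthetic arrays rather than real scans. The sketch covers only the decomposition step, not the RWF strategy, the ASFF module, or the diffusion model.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level orthonormal 2D Haar DWT (assumed wavelet, not from the paper).

    Returns four half-resolution subbands (LL, LH, HL, HH).
    LH/HL naming conventions differ between references.
    """
    x = np.asarray(x, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")
    a = x[0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-pass in both directions
    lh = (a - b + c - d) / 2.0  # difference between adjacent columns
    hl = (a + b - c - d) / 2.0  # difference between adjacent rows
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2; reconstructs the full-resolution array."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


def band_energy_fractions(x):
    """Fraction of total signal energy in each subband (input must be nonzero)."""
    bands = dict(zip(("LL", "LH", "HL", "HH"), haar_dwt2(x)))
    total = sum(np.sum(v ** 2) for v in bands.values())
    return {name: float(np.sum(v ** 2) / total) for name, v in bands.items()}


# Synthetic stand-ins only: a smooth array and a noisy array of the same size.
rng = np.random.default_rng(0)
t = np.linspace(0.0, np.pi, 64)
smooth = np.outer(np.sin(t), np.sin(t)) + 0.5
noisy = rng.random((64, 64))

for name, img in (("smooth", smooth), ("noisy", noisy)):
    print(name, band_energy_fractions(img))
```

Comparing the printed energy fractions for the two arrays shows how differently two inputs can distribute energy across the subbands, which is the kind of frequency imbalance the abstract refers to. Because the transform is orthonormal, the subbands preserve total energy and `haar_idwt2` recovers the input exactly up to floating-point error.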