This paper formalizes the connection between diffusion models and scale-space theory, showing that highly noisy diffusion states contain no more information than small, downsampled images. To avoid processing such states at full resolution, the authors formulate a family of diffusion models with generalized linear degradations; using downsampling as the degradation yields Scale Space Diffusion. They also propose Flexi-UNet, a UNet variant that performs resolution-preserving and resolution-increasing denoising using only the necessary parts of the network.
Why waste compute on high-resolution noise when a blurry thumbnail contains the same information?
Diffusion models degrade images through noise, and reversing this process reveals an information hierarchy across timesteps. Scale-space theory exhibits a similar hierarchy via low-pass filtering. We formalize this connection and show that highly noisy diffusion states contain no more information than small, downsampled images, which raises the question of why they must be processed at full resolution. To address this, we fuse scale spaces into the diffusion process by formulating a family of diffusion models with generalized linear degradations and practical implementations. Using downsampling as the degradation yields our proposed Scale Space Diffusion. To support Scale Space Diffusion, we introduce Flexi-UNet, a UNet variant that performs resolution-preserving and resolution-increasing denoising using only the necessary parts of the network. We evaluate our framework on CelebA and ImageNet and analyze its scaling behavior across resolutions and network depths. Our project website ( https://prateksha.github.io/projects/scale-space-diffusion/ ) is publicly available.
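
To make the idea of a downsampling degradation concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a forward process of the form x_t = alpha_t * D_t(x_0) + sigma_t * eps, where D_t is average pooling (a linear operator) whose factor grows with the timestep. The cosine noise schedule, the evenly spaced resolution schedule, and the names `downsample`, `resolution_factor`, and `forward_sample` are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import math
import random


def downsample(image, factor):
    """Average-pool a square 2D list by an integer factor (a linear degradation)."""
    size = len(image) // factor
    return [
        [
            sum(
                image[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ) / factor ** 2
            for c in range(size)
        ]
        for r in range(size)
    ]


def resolution_factor(t, num_steps, max_levels):
    """Assumed schedule: halve the resolution at evenly spaced timesteps."""
    level = min(max_levels, (t * (max_levels + 1)) // num_steps)
    return 2 ** level


def forward_sample(x0, t, num_steps, max_levels, rng):
    """Draw x_t = alpha_t * D_t(x0) + sigma_t * eps under an assumed cosine schedule."""
    low_res = downsample(x0, resolution_factor(t, num_steps, max_levels))
    angle = 0.5 * math.pi * t / num_steps
    alpha, sigma = math.cos(angle), math.sin(angle)
    # Noise is drawn at the reduced resolution, so noisier states are also smaller.
    return [[alpha * v + sigma * rng.gauss(0.0, 1.0) for v in row] for row in low_res]


if __name__ == "__main__":
    rng = random.Random(0)
    x0 = [[float(r + c) for c in range(16)] for r in range(16)]
    for t in (0, 250, 500, 999):
        xt = forward_sample(x0, t, num_steps=1000, max_levels=3, rng=rng)
        print(t, f"{len(xt)}x{len(xt[0])}")
```

Under these assumptions, a 16x16 input stays at 16x16 for the earliest timesteps and shrinks to 2x2 for the noisiest ones, so a denoiser would only need to process a full-resolution tensor near the end of sampling. This is the kind of saving that the resolution-increasing steps of a network such as Flexi-UNet are meant to exploit.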