DAMO, Independent researcher, RUC, Tencent AI | Apr 14, 2026 | arXiv:2604.12668

OFA-Diffusion Compression: Compressing Diffusion Model in One-Shot Manner

Haoyang Jiang, Mingyang Yi, Xiuyu Li, Lanqing Hu, Junxian Cai, Qingbin Liu, Ju Fan

AI Summary

This paper introduces OFA-Diffusion Compression, a once-for-all (OFA) training framework for diffusion models that allows subnetworks with different computational costs to be extracted without retraining. The authors address the slow optimization of existing OFA methods by restricting the candidate subnetworks to a specific set of parameter sizes and allocating channels according to their importance. Experiments show that the method produces compressed diffusion models of various sizes with reduced training overhead and satisfactory performance.

Key Contribution

Stop retraining your diffusion models for every device: OFA-Diffusion lets you extract the right-sized model in a single training run.

Abstract

The Diffusion Probabilistic Model (DPM) achieves remarkable performance in image generation, but its growing parameter size and computational overhead hinder its deployment in practical applications. To address this, the existing literature focuses on obtaining a smaller model with a fixed architecture through model compression. In practice, however, DPMs usually need to be deployed on various devices with different resource constraints, which requires multiple compression processes and incurs significant overhead from repeated training. To avoid this, we propose a once-for-all (OFA) compression framework for DPMs that yields different subnetworks with different computational costs in a one-shot training manner. The existing OFA framework typically involves a massive number of subnetworks with different parameter sizes, and such a huge candidate space slows the optimization. Thus, we propose to restrict the candidate subnetworks to a certain set of parameter sizes, where each size corresponds to a specific subnetwork. Specifically, to construct each subnetwork with a given size, we gradually allocate the maintained channels by their importance. Furthermore, we propose a reweighting strategy to balance the optimization process of different subnetworks. Experimental results show that our approach can produce compressed DPMs of various sizes with significantly lower training overhead while achieving satisfactory performance.
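The channel-allocation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `allocate_channels`, the per-channel importance scores, the per-channel costs, and the greedy stopping rule are all assumptions made for illustration, since the abstract does not specify how importance is measured or how budgets are expressed. The sketch keeps channels in order of importance until each size budget is filled, which makes every smaller subnetwork a subset of the larger ones.

```python
from typing import Dict, List, Sequence


def allocate_channels(
    importance: Sequence[float],
    cost_per_channel: Sequence[float],
    budgets: Sequence[float],
) -> Dict[float, List[int]]:
    """Greedily keep the most important channels until each budget is filled.

    Hypothetical helper: importance scores and costs are assumed to be given
    (for example by some pruning criterion); they are not taken from the paper.
    """
    if len(importance) != len(cost_per_channel):
        raise ValueError("importance and cost_per_channel must have equal length")

    # Rank channel indices from most to least important (ties keep index order).
    ranked = sorted(range(len(importance)), key=lambda i: -importance[i])

    subnets: Dict[float, List[int]] = {}
    for budget in sorted(budgets):
        kept: List[int] = []
        used = 0.0
        for idx in ranked:
            # Stop at the first channel that does not fit, so each subnetwork
            # is a prefix of the ranking and is nested inside the larger ones.
            if used + cost_per_channel[idx] > budget:
                break
            kept.append(idx)
            used += cost_per_channel[idx]
        subnets[budget] = sorted(kept)
    return subnets


# Toy example: five channels with unit cost and two target sizes.
masks = allocate_channels(
    importance=[0.9, 0.1, 0.5, 0.7, 0.3],
    cost_per_channel=[1, 1, 1, 1, 1],
    budgets=[2, 4],
)
print(masks)
```

In this toy setting each target size maps to exactly one subnetwork, mirroring the abstract's restriction of the candidate space to one subnetwork per parameter size. The reweighting strategy mentioned in the abstract is not modeled here.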

Computer Vision, Inference & Quantization, Training Efficiency & Optimization

Citation Metrics

Citations: 0
Influential citations: 0
References: 0
Year: 2026
Venue: N/A
