Feb 17, 2026 | arXiv:2602.15971

B-DENSE: Branching For Dense Ensemble Network Learning

Cherish Puniani, Tushar Kumar, Arnav Bendre, Gaurav Kumar, Shree Singhi

AI Summary

The paper introduces B-DENSE, a distillation framework for diffusion models that addresses the limitations of sparse supervision in existing distillation techniques by leveraging multi-branch trajectory alignment. B-DENSE modifies the student architecture to output K-fold expanded channels, each branch corresponding to a discrete intermediate step in the teacher's trajectory, thereby enforcing dense intermediate trajectory alignment. Experiments demonstrate that B-DENSE enables the student model to learn the solution space from early training stages, resulting in improved image generation quality compared to baseline distillation frameworks.

Key Contribution

B-DENSE addresses the sparse supervision of existing diffusion model distillation methods by densely aligning the student's output branches with the teacher's intermediate steps, which the authors report yields better image generation quality than baseline distillation frameworks.

Abstract

Inspired by non-equilibrium thermodynamics, diffusion models have achieved state-of-the-art performance in generative modeling. However, their iterative sampling nature results in high inference latency. While recent distillation techniques accelerate sampling, they discard intermediate trajectory steps. This sparse supervision leads to a loss of structural information and introduces significant discretization errors. To mitigate this, we propose B-DENSE, a novel framework that leverages multi-branch trajectory alignment. We modify the student architecture to output $K$-fold expanded channels, where each subset corresponds to a specific branch representing a discrete intermediate step in the teacher's trajectory. By training these branches to simultaneously map to the entire sequence of the teacher's target timesteps, we enforce dense intermediate trajectory alignment. Consequently, the student model learns to navigate the solution space from the earliest stages of training, demonstrating superior image generation quality compared to baseline distillation frameworks.
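The branching idea in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract does not specify the loss, the branch weighting, or the channel layout, so the mean squared error, the equal weighting across branches, the contiguous channel grouping, and the names `split_branches` and `dense_alignment_loss` are all assumptions made here for illustration.

```python
import numpy as np


def split_branches(student_out, k):
    """Split a (B, K*C, H, W) student output into K branches of shape (B, C, H, W).

    Assumption: branch i occupies the contiguous channels [i*C, (i+1)*C).
    """
    b, kc, h, w = student_out.shape
    if kc % k != 0:
        raise ValueError("channel count must be divisible by k")
    return student_out.reshape(b, k, kc // k, h, w)


def dense_alignment_loss(student_out, teacher_states):
    """Compare every branch with its matching teacher intermediate state.

    teacher_states has shape (B, K, C, H, W): K states along the teacher's
    trajectory. MSE with equal branch weights is an assumption, not taken
    from the paper.
    """
    k = teacher_states.shape[1]
    branches = split_branches(student_out, k)
    # One MSE value per branch, averaged over batch, channels, and pixels.
    per_branch = ((branches - teacher_states) ** 2).mean(axis=(0, 2, 3, 4))
    return per_branch.mean(), per_branch


# Toy data: batch of 2, K = 4 branches, C = 3 channels, 8x8 images.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(2, 4, 3, 8, 8))

# A student that reproduces every intermediate state exactly.
perfect_student = teacher.reshape(2, 12, 8, 8).copy()
perfect_loss, _ = dense_alignment_loss(perfect_student, teacher)

# Shift only branch 2 (channels 6..8) by 1.0 to see the per-branch signal.
shifted_student = perfect_student.copy()
shifted_student[:, 6:9] += 1.0
shifted_loss, shifted_per_branch = dense_alignment_loss(shifted_student, teacher)
```

In this sketch each branch receives its own error term, so a mismatch at one intermediate step is visible in `shifted_per_branch` rather than being hidden behind a single endpoint loss, which is the contrast with sparse supervision that the abstract draws.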

Architecture Design (Transformers, SSMs, MoE) | Inference & Quantization | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
