Search papers, labs, and topics across Lattice.
4
0
6
2
Unlock the full potential of your pretrained video diffusion models with a surprisingly simple four-stage post-training framework that drastically improves visual quality, temporal coherence, and instruction following.
Linear transport flows between degraded and clean image domains enable fast, adaptable image restoration that outperforms existing methods in distortion-perception balance.
RL fine-tuning of hybrid autoregressive-diffusion models can be made significantly more stable and effective by averaging gradients across multiple diffusion trajectories and filtering autoregressive tokens for consistency.
Achieve real-time, synchronized audio-visual generation at 25 FPS by distilling a bidirectional diffusion model into a fast, autoregressive architecture, overcoming training instability with novel alignment and token handling techniques.