Search papers, labs, and topics across Lattice.
4
1
7
1
Aligning diffusion models with just 100 carefully selected samples can beat state-of-the-art preference optimization methods trained on thousands, and converge up to 220x faster.
K-means, previously relegated to offline processing, gets a 17.9x speed boost on modern GPUs thanks to Flash-KMeans' clever IO and contention optimizations.
Get 2x faster video generation from diffusion transformers without sacrificing quality, thanks to a clever parameter-free error compensation technique.
By explicitly detecting and escaping "Forbidden Zones" during training, AMD unlocks significant gains in sample fidelity and training robustness for few-step generative models like SDXL.