Block diffusion LLMs can achieve near-lossless long-context performance with 3-4x speedups by using attention patterns learned from the first denoising step, unlocking efficient sparse attention without autoregressive approximations.