Masked diffusion language models can now be 21.8x more compute-efficient than autoregressive models, thanks to binary encoding and index shuffling.