Search papers, labs, and topics across Lattice.
1
0
3
Uniform-state diffusion models, often overlooked in favor of masked diffusion, surprisingly outperform autoregressive and masked diffusion models on GSM8K when scaled to 1.7B parameters, despite worse perplexity.