Vortex achieves up to 4.7× higher throughput for large language models and streamlines how researchers prototype and evaluate sparse attention algorithms.
Ditch the slow lane: $R^2$-dLLM speeds up diffusion language models by cutting decoding steps by up to 75% without sacrificing quality.