Nemotron 3 Ultra achieves six times the inference throughput of current LLMs while maintaining accuracy, setting a new performance bar for agentic reasoning tasks.
Ditch the slow lane: $R^2$-dLLM turbocharges diffusion language models by slashing decoding steps by up to 75% without sacrificing quality.
Nemotron 3 Super proves that combining Mamba, Attention, and Mixture-of-Experts can match the accuracy of existing 120B models while delivering significantly higher inference throughput.