Nemotron 3 Super shows that you can match the accuracy of existing 120B-class models while delivering significantly higher inference throughput by combining Mamba, Attention, and Mixture-of-Experts layers in one hybrid architecture.
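To make the hybrid design concrete, here is a minimal PyTorch sketch of how Mamba-style mixers, attention layers, and MoE feed-forward blocks can be interleaved in a single stack. Everything in it is an illustrative assumption: the layer pattern, dimensions, the simplified SSM stand-in (real Mamba uses a selective state-space scan), and the tiny top-2 router are hypothetical, not Nemotron 3 Super's actual configuration.

```python
# Illustrative hybrid Mamba/Attention/MoE stack. All sizes, the layer
# pattern, and the simplified mixers are hypothetical placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SSMBlock(nn.Module):
    """Stand-in for a Mamba-style mixer: gated causal depthwise conv.
    Real Mamba uses a selective state-space scan instead."""
    def __init__(self, d_model: int):
        super().__init__()
        self.in_proj = nn.Linear(d_model, 2 * d_model)
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=4,
                              padding=3, groups=d_model)  # causal depthwise conv
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x):                                  # x: (B, T, d_model)
        h, gate = self.in_proj(x).chunk(2, dim=-1)
        h = self.conv(h.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        return self.out_proj(F.silu(gate) * h)

class AttentionBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):  # causal masking omitted for brevity
        out, _ = self.attn(x, x, x, need_weights=False)
        return out

class MoEFFN(nn.Module):
    """Tiny top-2 mixture-of-experts feed-forward layer."""
    def __init__(self, d_model: int, n_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)])
        self.top_k = top_k

    def forward(self, x):
        weights = self.router(x).softmax(dim=-1)           # (B, T, E)
        topw, topi = weights.topk(self.top_k, dim=-1)      # route each token
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = topi[..., k] == e                   # tokens sent to e
                if mask.any():
                    out[mask] += topw[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out

class HybridBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int, mixer: str):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.mixer = (SSMBlock(d_model) if mixer == "mamba"
                      else AttentionBlock(d_model, n_heads))
        self.norm2 = nn.LayerNorm(d_model)
        self.ffn = MoEFFN(d_model)

    def forward(self, x):
        x = x + self.mixer(self.norm1(x))                  # sequence mixing
        return x + self.ffn(self.norm2(x))                 # sparse FFN

# Mostly-Mamba pattern with sparse attention layers (hypothetical ratio).
pattern = ["mamba", "mamba", "mamba", "attention"] * 2
model = nn.Sequential(*[HybridBlock(256, 8, m) for m in pattern])
print(model(torch.randn(2, 16, 256)).shape)                # (2, 16, 256)
```

The throughput argument hinges on this mix: the Mamba-style layers avoid attention's quadratic cost and its per-token KV cache, while the sparse MoE layers keep per-token compute low relative to total parameter count.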
Training trillion-parameter Mixture-of-Experts models just got a whole lot faster: Megatron Core now sustains over 1 PFLOP/s per GPU on NVIDIA's latest hardware.
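For a sense of what 1 PFLOP/s per GPU means in practice, here is a back-of-the-envelope sketch using the standard ~6 x parameters x tokens approximation for a combined forward and backward training pass. The token rate, GPU count, and active-parameter count below are made-up illustrative numbers, not NVIDIA's benchmark figures.

```python
# Rough meaning of ">1 PFLOP/s per GPU": achieved training throughput,
# estimated via the standard ~6 * params * tokens rule of thumb for a
# forward + backward pass.

def achieved_flops_per_gpu(active_params: float, tokens_per_sec: float,
                           n_gpus: int) -> float:
    """Approximate training FLOP/s per GPU.

    For MoE models, `active_params` counts only the parameters each token
    actually uses (shared layers + routed experts), not the total
    trillion-parameter count.
    """
    return 6 * active_params * tokens_per_sec / n_gpus

# Hypothetical setup: a trillion-parameter MoE with ~50B active params per
# token, sustaining 3.5M tokens/s across 1,024 GPUs.
rate = achieved_flops_per_gpu(active_params=50e9,
                              tokens_per_sec=3.5e6,
                              n_gpus=1024)
print(f"{rate / 1e15:.2f} PFLOP/s per GPU")  # -> 1.03 PFLOP/s per GPU
```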