Search papers, labs, and topics across Lattice.
3
0
6
5
Multimodal models can now achieve state-of-the-art performance in real-world tasks like document understanding and audio-video comprehension with significantly reduced inference latency thanks to novel token-reduction techniques.
Nemotron 3 Super proves you can achieve comparable accuracy to existing 120B models, but with significantly higher inference throughput, by combining Mamba, Attention, and Mixture-of-Experts.
By decoupling MLLM instruction tuning from DiT alignment, DuoGen achieves state-of-the-art interleaved multimodal generation without costly unimodal pretraining.