Multimodal models can now achieve state-of-the-art performance on real-world tasks such as document understanding and audio-video comprehension, with significantly reduced inference latency, thanks to novel token-reduction techniques.
Autonomous coding agents can now outperform expert-engineered attention kernels on NVIDIA's latest Blackwell GPUs, discovering optimizations that eluded human experts.
Agentic AI systems are still far from maximizing hardware potential: SOL-ExecBench reveals a significant gap between current GPU kernel performance and analytically derived Speed-of-Light bounds across a wide range of AI models.