Late-interaction retrieval just got a whole lot faster and cheaper: Flash-MaxSim slashes memory usage by 16x and speeds up inference by 4.7x on an H100 by ditching the massive similarity tensor.
A new 2B-parameter vision-language model, Granite Vision, rivals larger models on visual document understanding tasks while offering a transparent, commercially friendly open-source license.
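
For readers unfamiliar with the late-interaction scoring mentioned in the Flash-MaxSim item, here is a minimal sketch of the general idea of computing a MaxSim score without holding the full query-by-document similarity matrix in memory. It is an illustration only, not the Flash-MaxSim kernel: the function names (`maxsim_naive`, `maxsim_streaming`), the `tile` parameter, and the pure-Python layout are all assumptions made for this example, and it assumes dot-product similarity and a non-empty document.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length token embeddings."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_naive(query: List[Vector], doc: List[Vector]) -> float:
    """Reference version: builds the full len(query) x len(doc) matrix."""
    sim = [[dot(q, d) for d in doc] for q in query]
    # For each query token, keep its best-matching document token, then sum.
    return sum(max(row) for row in sim)


def maxsim_streaming(query: List[Vector], doc: List[Vector], tile: int = 2) -> float:
    """Hypothetical tiled version: keeps only one running max per query token.

    Document tokens are visited in tiles of size `tile`, so the full
    similarity matrix is never stored; memory is O(len(query)).
    """
    best = [float("-inf")] * len(query)
    for start in range(0, len(doc), tile):
        block = doc[start:start + tile]
        for i, q in enumerate(query):
            for d in block:
                s = dot(q, d)
                if s > best[i]:
                    best[i] = s
    return sum(best)


if __name__ == "__main__":
    query = [[1.0, 0.0], [0.0, 1.0]]
    doc = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]
    print(maxsim_naive(query, doc), maxsim_streaming(query, doc))
```

Both functions return the same score; the streaming variant simply trades the stored matrix for a running maximum, which is the kind of saving the summary attributes to dropping the similarity tensor. The reported 16x and 4.7x figures come from the paper's own GPU implementation and are not reproduced by this sketch.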