Finally, a single model that handles any segmentation task in both images and videos, understanding both text and visual prompts.
Static geometric features are holding back MLLMs' spatial reasoning abilities; GeoAlign's dynamic, content-aware routing of multi-layer features unlocks SOTA performance with a compact model.