LCLMs redefine the efficiency of long-context inference, achieving superior compression without sacrificing model quality.
Even the top-performing MLLMs struggle with visual reasoning, achieving only 64% accuracy on a benchmark designed to reflect real-world diversity.
Forget complex architectures and task-specific designs: VLMs are already native 3D learners with the right training recipe.
VLMs can get a 10% boost in spatial reasoning and 3D understanding by training on just 10,000 synthetic images generated automatically from task keywords.
Vero, an open-source VLM trained with RL on a diverse 600K-sample dataset, closes the performance gap with proprietary models and shows that broad task coverage, not just scale, is the key to unlocking general visual reasoning.