Leaderboard-topping video models are still surprisingly brittle, failing on basic video reasoning tasks unless given the right textual cues.
Forget dialogue summaries – FileGram builds user profiles directly from atomic file-system actions, unlocking a richer, more privacy-preserving approach to agent personalization.
Achieve 9x lower trajectory error and 3x better FID in motion generation by using a diffusion-based discrete motion tokenizer that elegantly handles both semantic and kinematic constraints.
Unlock real-time 3D understanding: MonoArt achieves state-of-the-art monocular articulated object reconstruction without relying on multi-view data or external motion templates.
Today's best multimodal models solve only half of compositional visual tool-use tasks, revealing a critical gap in their ability to plan and execute complex, multi-step visual reasoning.