Leaderboard-topping video models are still surprisingly brittle, failing on basic video reasoning tasks unless given the right textual cues.
Forget dialogue summaries – FileGram builds user profiles directly from atomic file-system actions, unlocking a richer, more privacy-preserving approach to agent personalization.
Today's best multimodal models can only solve half of compositional visual tool-use tasks, revealing a critical gap in their ability to plan and execute complex, multi-step visual reasoning.