Unlock geometric reasoning in MLLMs by parsing diagrams into a unified formal language that spans both 2D and 3D geometry.
A 3B model can match the performance of models more than twice its size in mobile GUI automation by distilling visual history into concise natural language summaries.
Current VLM-driven embodied agents struggle with fundamental skills like navigation and object manipulation when evaluated in realistic, low-level action spaces, severely hindering their performance on complex tasks.