VLMs still struggle to grasp spatial relationships in dynamic sports scenes, as evidenced by a new benchmark revealing a significant human-AI performance gap.
Forget hand-annotated 3D datasets: a new automated pipeline generates massive, high-quality 3D spatial intelligence from raw video, unlocking better VLM reasoning.
Fine-tuning LLMs on datasets filtered at the token level, rather than the sentence level, can boost performance by up to 13.7%.