Unlock human-like spatial reasoning in VLMs with VLM-3R, which reconstructs 3D understanding from monocular video using instruction tuning, bypassing the need for external depth sensors.
Using preference data from stronger models to align LLMs via DPO can backfire, dramatically worsening safety by making models more susceptible to jailbreaking.