SR-REAL shows that combining linguistic deduction with 3D geometric reasoning can markedly improve spatial reasoning in VLMs.
Grounding boosts spatial reasoning in VLMs: explicitly linking language to 2D and 3D scene elements lets models decompose complex spatial problems and improve performance even on non-grounded tasks.