Current MLLMs struggle with fine-grained spatial reasoning, achieving only 37.2 F1 on challenging tasks compared to human performance of 84.0 F1.
Seemingly innocuous choices in loss functions and training regimes can significantly hinder visual geometry estimation, even for state-of-the-art methods.