Current MLLMs struggle with fine-grained spatial reasoning, achieving only 37.2 F1 on challenging tasks compared to human performance of 84.0 F1.
Generative training not only improves a model's ability to manipulate objects in images but also, surprisingly, strengthens its spatial reasoning.