Forget scaling reasoning – this work shows that scaling visual perception using code-grounded data is the real key to unlocking MLLMs' STEM abilities.
Multimodal models are often blind at birth: a new "Visual Attention Score" reveals they struggle to focus on visual inputs during cold-start, but a simple attention-guided fix can boost performance by 7%.