Ditch slow, irrelevant text-based reasoning: VISUALTHINK-VLA uses visual tokens to speed up vision-language-action policies by 22x while boosting accuracy.
VLA models may excel at visually grounded tasks, but VLA-Trace reveals they still struggle with fine-grained semantic understanding and exhibit distinct modality processing strategies.