AffordanceVLA transforms robotic manipulation by using structured affordance cues to create precise perception-action mappings, outperforming traditional models.
Moving beyond text-only chain-of-thought unlocks better audio-visual reasoning by interleaving textual steps with a unified latent space that preserves dense sensory information.
Finally, a robot can reliably pick out that specific shirt from a messy pile, thanks to a new vision-language pipeline that reasons about garment affordances.