Unlock the power of web videos for embodied AI: implicit geometry representations let agents learn to navigate from real-world room tours without relying on fragile 3D reconstruction.
Visuomotor policies can learn to ignore distracting visual variations simply by preprocessing raw RGB images into task-aware, semantic-geometric representations *before* feeding them to the policy.
Forget monolithic action decoders: AtomicVLA's skill-guided mixture-of-experts unlocks significant gains in long-horizon robotic manipulation and continual learning.
Reconstructing realistic 3D hand avatars from messy real-world video improves with a new method that explicitly models and suppresses visual "noise" such as motion blur and object interactions.