A 7B model trained with RL can outperform 72B-scale general MLLMs in robotic manipulation process supervision by explicitly reasoning about progress toward the final task goal.
Achieve real-time embodied manipulation with large 3D vision models using a novel asynchronous architecture that boosts success rates by up to 51.4% while also reducing inference time.
Forget short-term context windows: VPWEM's Transformer-based memory compressor lets robots ace long-horizon manipulation tasks by distilling past observations into fixed-size episodic memories.
Bimanual robots can now achieve robust dexterous grasping in the real world, thanks to a massive 20M-frame synthetic dataset and a simple attention-based policy that transfers surprisingly well.