Achieve zero-shot cross-embodiment visual tracking by dynamically adapting control policies to inferred embodiment constraints, eliminating the need for per-robot training.
Pruning 90% of visual tokens without sacrificing performance could make 3D scene understanding in multimodal models far more efficient.
Forget painstakingly annotating paired images: VGGT-Segmentor achieves state-of-the-art cross-view segmentation by cleverly pretraining on single images, sidestepping the need for correspondence.