Robots get a crucial boost in robustness by learning to "see" and predict how objects will move, not just react to the current frame.
DINO-VO's learned patch selection and differentiable bundle adjustment leapfrog traditional heuristic feature extraction, achieving SOTA monocular visual odometry with impressive generalization.
Teaching robots to manipulate objects just got easier: OCRA learns directly from human demonstration videos by focusing on object interactions and incorporating tactile feedback.
Forget trying to wrangle dynamic 4D scenes with recurrent networks: DynamicVGGT achieves state-of-the-art reconstruction accuracy using a surprisingly effective feed-forward approach.
Pre-training on universal 3D poses lets robots learn new tasks from just 100 demonstrations, sidestepping the usual VLA efficiency bottleneck.
MLLMs can "hear" a little, but EgoSound reveals they're still largely deaf to the nuances of sound in egocentric video, especially when it comes to spatial and causal reasoning.