Forget wrangling dynamic 4D scenes with recurrent networks: DynamicVGGT achieves state-of-the-art reconstruction accuracy with a surprisingly effective feed-forward approach.
Pre-training on universal 3D poses lets robots learn new tasks from just 100 demonstrations, sidestepping the usual VLA data-efficiency bottleneck.
MLLMs can "hear" a little, but EgoSound shows they remain largely deaf to the nuances of sound in egocentric video, especially for spatial and causal reasoning.