Search papers, labs, and topics across Lattice.
3
0
4
0
A single spatial token, learned via occupancy prediction on a massive dataset, is surprisingly effective at injecting crucial spatial awareness into vision-language navigation, leading to state-of-the-art performance.
Forget hand-crafted heuristics: this new dynamics-aware policy learns to exploit contact forces in cluttered environments, outperforming traditional methods by 25% in simulation and showing impressive sim-to-real transfer.
Reconstructing realistic and physically plausible 3D scenes from video now involves actively optimizing viewpoints for object generation and synthesizing scene graphs to guide simulator-based assembly.