A single spatial token, learned via occupancy prediction on a massive dataset, is surprisingly effective at injecting spatial awareness into vision-language navigation, yielding state-of-the-art performance.