Chinese Academy of Sciences
VLMs can achieve state-of-the-art Vision-Language Navigation performance when they are explicitly trained to reason about past actions and predict future visual transitions.
Open-source VLN agents can nearly double their navigation success by remembering where they've been, thanks to a new hierarchical memory system.