Navigating with fewer than 8 VLM calls per episode, Goal2Pixel redefines efficiency in vision-language navigation tasks.
Bidirectional interaction between enhanced understanding, controllable spatial editing, and novel-view-assisted reasoning enables a unified multimodal model to achieve spatial intelligence beyond general visual competence.
Bridging the gap between human manipulation and robotic control, JoyAI-RA unlocks enhanced cross-embodiment behavior learning through multi-source pretraining.
Achieve 200x faster immersed boundary flow simulations without sacrificing accuracy by training a neural network to correct coarse-grained physics simulations.
Unlock multimodal interleaved generation in existing vision-language models, without large interleaved datasets, using a novel reinforcement learning approach with hybrid rewards.
Forget monolithic action decoders: AtomicVLA's skill-guided mixture-of-experts unlocks significant gains in long-horizon robotic manipulation and continual learning.