VLN agents can navigate more accurately in zero-shot settings by "looking forward, now, and backward," mimicking human navigational strategies.
Existing robotic methods struggle with fundamental physical reasoning challenges, as KinDER's rigorous benchmark evaluation shows.
Forget clunky animation pipelines – MotionBricks lets you assemble real-time, high-quality character motions like LEGOs, even controlling robots.
Open-vocabulary 3D instance segmentation just got 100x faster, thanks to a new transformer architecture that ditches region proposals and fragmented masks.
Fusing MPC with RL yields safer and more efficient autonomous driving at intersections, outperforming both standalone MPC and end-to-end RL and, surprisingly, generalizing better to new scenarios.
RoboLab exposes critical performance gaps in leading robotic models, revealing that high-fidelity simulations can better assess generalization than traditional benchmarks.
Training autonomous vehicles can be dramatically sped up: MOSAIC achieves state-of-the-art driving performance with 80% less data by intelligently selecting training examples based on scaling laws.
Finally, a method disentangles dynamic egocentric scenes into background, hand, and object components, enabling fine-grained understanding and editing.
Swap out slow, one-token-at-a-time generation in VLMs for a 6x speed boost, without sacrificing quality, using a surprisingly simple direct conversion to block-diffusion decoding.
Finally, a video generation model lets you puppeteer objects and their reactions independently, all while freely moving the camera.
Achieve 49% and 19% better Chamfer distance than state-of-the-art dynamic surface reconstruction methods on Hi4D and CMU Panoptic datasets, respectively, by enforcing temporal consistency in Gaussian Splatting.
A hybrid cuVSLAM-based visual SLAM system achieves superior mapping accuracy in real-world logistics environments, outperforming other VO/VSLAM approaches.
Forget slow, model-dependent curation: FAKTUAL offers a fast, model-free way to boost robot imitation learning by directly maximizing the entropy of demonstration datasets.
Training generalist robots just got a whole lot easier: RoboCasa365 offers a massive, diverse, and reproducible benchmark for household mobile manipulation.
Forget simulated manipulation—ManipulationNet offers a global infrastructure for benchmarking robots in the real world, complete with standardized hardware and software, to finally measure progress toward general manipulation.
Learning robotic reward functions from a million trajectories reveals that comparing entire trajectories, not just individual frames, unlocks better generalization and learning from suboptimal data.
Forget tedious manual segmentation: ArtisanGS lets you lasso objects in 3D Gaussian Splats with AI-powered 2D selections that propagate into 3D, giving you unprecedented control over editing.
Forget synthetic data that looks like it came from a PS2 game: NVIDIA's new Cosmos-Predict2.5 generates high-fidelity videos for training embodied AI, opening the door to more realistic and reliable simulations.