Achieve 49% and 19% lower Chamfer distance than state-of-the-art dynamic surface reconstruction methods on the Hi4D and CMU Panoptic datasets, respectively, by enforcing temporal consistency in Gaussian Splatting.
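For readers unfamiliar with the metric above: Chamfer distance averages nearest-neighbor distances between two point sets in both directions. A minimal NumPy sketch of one common convention (mean of squared nearest-neighbor distances, summed over both directions); conventions vary across papers, and this is not the implementation used in the work above:

```python
import numpy as np

def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Averages squared nearest-neighbor distances from a to b and from
    b to a, then sums the two directions.
    """
    # Pairwise squared Euclidean distances, shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())
```

Lower is better: identical point sets score exactly zero, which is why the blurb's "49% lower" reads as an improvement.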
Humanoid robots can now traverse complex terrains with human-like gaits, thanks to a surprisingly simple and efficient framework that eschews adversarial training.
Robots can now manipulate objects with greater dexterity and adaptability thanks to a new world model that leverages both vision and high-frequency tactile feedback to predict and react to contact dynamics.
By explicitly reasoning in 3D, VolumeDP leaps ahead of 2D-based imitation learning methods, achieving a remarkable 14.8% improvement on the LIBERO benchmark and robust real-world generalization.
Ditch fixed compute budgets: this new flow-matching method for robotic control adaptively allocates computation, speeding up simple tasks and focusing on complex ones.
Stop wrestling with incompatible human body models: SOMA lets you mix and match SMPL, SMPL-X, and more, unlocking the power of diverse datasets in a single, differentiable pipeline.
Forget expensive real-world data collection: a massive, diverse synthetic dataset enables surprisingly effective zero-shot transfer for robotic manipulation.
A hybrid cuVSLAM-based visual SLAM system outperforms other VO/VSLAM approaches in mapping accuracy across real-world logistics environments.
World Action Models can ditch the slow, iterative "imagine-then-execute" loop at test time without sacrificing performance, achieving a 4x speedup.
Humanoid robots can now handle heavy, unknown payloads in the real world thanks to a system that identifies mass distribution via differentiable simulation.
Kimodo leaps ahead in controllable human motion generation by training a diffusion model on a massive 700-hour mocap dataset, enabling unprecedented control fidelity via text and diverse kinematic constraints.
Forget slow, model-dependent curation: FAKTUAL offers a fast, model-free way to boost robot imitation learning by directly maximizing the entropy of demonstration datasets.
Achieve a 40% jump in success rates on real-world contact-rich manipulation by intelligently scheduling force feedback into visual-motor policies.
Forget predefined areas of interest: this multi-agent exploration framework uses Gaussian belief mapping to adaptively balance scientific discovery and safety in hazardous off-world environments.
Human-robot teams can slash interaction costs by 50% and task times by 25% when robots actively resolve uncertainty about tasks and infer human intent using LLMs and spatial reasoning.
By combining video generation and vision-language models, EmboAlign achieves a 43% boost in real-world robot manipulation success without any task-specific training.
Training generalist robots just got a whole lot easier: RoboCasa365 offers a massive, diverse, and reproducible benchmark for household mobile manipulation.
Forget simulated manipulation: ManipulationNet offers a global infrastructure for benchmarking robots in the real world, complete with standardized hardware and software, to finally measure progress toward general manipulation.
Forget everything you thought you knew about continual learning: pretrained Vision-Language-Action models can learn new robotic skills without catastrophic forgetting, even with minimal replay.
Learning robotic reward functions from a million trajectories reveals that comparing entire trajectories, not just individual frames, unlocks better generalization and learning from suboptimal data.
Achieve up to 28% better success rates in whole-body mobile manipulation by decoupling base and arm control while intelligently allocating perceptual attention.
ShallowConvNet emerges as a surprisingly effective architecture for decoding user intent from EEG signals in real-world robotic control, outperforming more complex models like Transformers.
Unlock robot learning with hidden knowledge: TOPReward extracts surprisingly accurate task progress signals directly from VLM token probabilities, bypassing the need for explicit reward engineering.
Forget painstakingly engineering robot behaviors: DreamZero learns directly from video of other robots or even humans, adapting to new tasks and bodies with just minutes of data.
Forget robotics pre-training: ActionCodec, a new action tokenizer designed with information-theoretic principles, achieves state-of-the-art VLA performance on LIBERO.
Forget complex architectures: RaCo achieves SOTA keypoint matching and repeatability by cleverly combining ranking and covariance estimation in a lightweight network, trained without covisible image pairs.
Forget static datasets: RL-based co-training unlocks +20% real-world VLA performance by interactively leveraging simulation while preserving real-world capabilities.
Training a robot foundation model on 30,000 hours of heterogeneous embodied data lets it outperform prior methods by up to 48% on complex manipulation tasks and even benefit from low-quality data.
Forget tedious manual segmentation: ArtisanGS lets you lasso objects in 3D Gaussian Splats with AI-powered 2D selections that propagate into 3D, giving you unprecedented control over editing.
Forget synthetic data that looks like it came from a PS2 game: NVIDIA's new Cosmos-Predict2.5 generates high-fidelity videos for training embodied AI, opening the door to more realistic and reliable simulations.
Ditch slow iterative refinement: conditional flow-matching models can directly learn meaningful proposal distributions from noisy sampling-based MPC data, slashing planning time.
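The flow-matching idea behind blurbs like the one above: train a network to regress the velocity of a probability path from noise to data, so sampling becomes integrating an ODE rather than iterative refinement. A toy sketch of constructing one training example under generic assumptions (linear "rectified" path, Gaussian noise endpoint); the specific conditioning and MPC data pipeline are the paper's, not shown here:

```python
import numpy as np

def flow_matching_pair(x1: np.ndarray, rng: np.random.Generator):
    """Build one conditional flow-matching regression example.

    Linear path: x_t = (1 - t) * x0 + t * x1, with noise endpoint
    x0 ~ N(0, I). The target for a velocity network v_theta(x_t, t)
    is the constant path velocity x1 - x0.
    """
    x0 = rng.standard_normal(x1.shape)   # noise endpoint
    t = rng.uniform()                    # random time in [0, 1]
    x_t = (1.0 - t) * x0 + t * x1       # point on the path
    v_target = x1 - x0                   # velocity regression target
    return x_t, t, v_target
```

At inference, integrating the learned velocity field from t = 0 to t = 1 carries a noise sample to a data sample in a fixed, small number of steps, which is where the planning-time savings come from.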
Imagine training robots to manipulate objects in the real world, but entirely within a high-fidelity, diffusion-based dream.