37 papers from CMU Machine Learning on Robotics & Embodied AI
LLMs in embodied environments get a massive boost from structured rules, with rule retrieval alone contributing +14.9 pp to single-trial success.
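The mechanism is easy to picture: retrieve the rules most relevant to the current observation and prepend them to the agent's prompt. A minimal sketch of embedding-based rule retrieval, not the paper's actual pipeline; the `embed` function, rule list, and prompt format below are illustrative placeholders:

```python
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    # Toy hashing encoder standing in for a real sentence-embedding model.
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    return v / (np.linalg.norm(v) + 1e-8)

RULES = [
    "Open a container before trying to place an object inside it.",
    "Turn on the light before searching a dark room.",
    "Heat food with the microwave, not the sink.",
]

def retrieve_rules(observation: str, k: int = 2) -> list[str]:
    # Score every rule against the current observation and keep the top k.
    obs_vec = embed(observation)
    scores = [float(obs_vec @ embed(rule)) for rule in RULES]
    return [RULES[i] for i in np.argsort(scores)[::-1][:k]]

obs = "You are in a dark room next to a closed cabinet."
prompt = ("Relevant rules:\n- " + "\n- ".join(retrieve_rules(obs))
          + f"\n\nObservation: {obs}\nAction:")
print(prompt)
```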
Forget retargeting: RoboForge's physics-optimized pipeline lets humanoids nail text-guided locomotion with greater accuracy and stability than retargeting-based approaches.
LLMs can navigate complex 3D environments more effectively and with far fewer tokens by using a hierarchical scene graph representation derived from omnidirectional sensor data.
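The token savings come from handing the LLM a hierarchy instead of raw geometry. A minimal sketch of the representation, assuming a room/object tree serialized as indented text (the node names and structure below are invented for illustration):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    children: list["Node"] = field(default_factory=list)

def serialize(node: Node, depth: int = 0) -> str:
    # Indented text is far cheaper in tokens than point clouds or images.
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.append(serialize(child, depth + 1))
    return "\n".join(lines)

scene = Node("floor_1", [
    Node("kitchen", [Node("fridge"), Node("table")]),
    Node("hallway", [Node("door_to_kitchen")]),
])
print(serialize(scene))  # compact context the LLM planner can reason over
```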
By fusing IMU and insole pressure data within a physics simulation, GRIP achieves more physically plausible human motion capture than IMU-only methods.
Accurately simulating the snap-fit mechanics of interlocking bricks, BrickSim unlocks a new level of realism for robotic manipulation research involving complex assemblies.
Forget expensive real-world data collection: a massive, diverse synthetic dataset enables surprisingly effective zero-shot transfer for robotic manipulation.
Strategic recovery from failures is key to deploying robots for complex assembly tasks in the real world.
Forget hand-tuning controllers for each new linear system: a single transformer can learn near-optimal control policies across diverse MIMO LTI systems.
AssistMimic enables humanoid robots to learn complex, force-exchanging assistive motions by reformulating imitation learning as a multi-agent RL problem.
Injecting muscle synergy priors into reinforcement learning drastically improves the realism of simulated human locomotion, even with limited real-world data.
Unmasked policy gradient methods can inadvertently suppress valid actions in unvisited states, creating a hidden exploration bottleneck that masking neatly avoids.
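A minimal PyTorch sketch of the masking fix the line alludes to: invalid actions get -inf logits, so the softmax competes only among valid ones and the policy gradient stops leaking probability mass away from valid-but-unvisited actions (the advantage value below is a placeholder):

```python
import torch
from torch.distributions import Categorical

def masked_policy(logits: torch.Tensor, valid: torch.Tensor) -> Categorical:
    # -inf logits give invalid actions zero probability and zero gradient.
    return Categorical(logits=logits.masked_fill(~valid, float("-inf")))

logits = torch.randn(5, requires_grad=True)
valid = torch.tensor([True, True, False, True, False])
dist = masked_policy(logits, valid)
action = dist.sample()            # only ever samples valid actions
advantage = 1.0                   # placeholder for the estimated advantage
loss = -dist.log_prob(action) * advantage
loss.backward()
print(dist.probs, logits.grad)    # masked entries: zero prob, zero grad
```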
Forget manual labeling: influence functions can automatically surface high-quality robot demonstrations, boosting policy performance by intelligently curating training data.
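One common way to make this concrete is a first-order influence proxy: score each demonstration by how well its gradient aligns with the gradient on a trusted validation set, then keep the top scorers. A hedged sketch of that generic proxy, not necessarily the paper's estimator; the model, loss, and data are toy placeholders:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in for a behavior-cloning policy

def bc_loss(batch):
    obs, act = batch
    return nn.functional.mse_loss(model(obs), act)

def grad_vector(loss):
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.flatten() for g in grads])

def influence_scores(demos, val_batch):
    # High alignment with the validation gradient marks a helpful demo.
    val_g = grad_vector(bc_loss(val_batch))
    return [torch.dot(grad_vector(bc_loss(d)), val_g).item() for d in demos]

demos = [(torch.randn(8, 4), torch.randn(8, 2)) for _ in range(5)]
val_batch = (torch.randn(16, 4), torch.randn(16, 2))
scores = influence_scores(demos, val_batch)
keep = sorted(range(len(demos)), key=lambda i: -scores[i])[:3]
print(scores, keep)  # curate training on the top-scoring demonstrations
```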
Panoramic depth perception and differentiable physics unlock surprisingly robust collision avoidance, even generalizing to unseen simulation environments.
Unsupervised discovery of object keypoints and dynamics directly from video unlocks state-of-the-art world models applicable to decision-making.
Skip the motion-capture grind: train your hip exoskeleton controller entirely in simulation and still see it work on real hardware.
Robots can now achieve stable, compliant object transport in unstructured environments, even with strong and unpredictable interaction forces, thanks to a bio-inspired control framework that separates interaction execution from support control.
HALyPO stabilizes human-robot collaboration by directly certifying the convergence of decentralized policy learning in parameter space, sidestepping the oscillations that plague standard MARL approaches.
Achieve real-time safe control of complex robots by representing their dynamics as a linear system in a higher-dimensional space, enabling fast quadratic programming for both tracking and obstacle avoidance.
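The payoff of the lifted-linear view: once dynamics are linear in the lifted state, tracking plus avoidance becomes one small QP per control step. A minimal cvxpy sketch under made-up lifting and dynamics matrices (the basis `phi`, matrices `A`/`B`, and keep-out constraint are all illustrative assumptions):

```python
import numpy as np
import cvxpy as cp

def phi(x):
    # Illustrative polynomial lifting; the paper's basis may differ.
    return np.array([x[0], x[1], x[0] * x[1]])

# Hypothetical lifted-linear dynamics z_{t+1} = A z_t + B u_t.
A = np.array([[1.0, 0.1, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.9]])
B = np.array([[0.0], [0.1], [0.0]])

z = phi(np.array([0.5, -0.2]))
z_ref = phi(np.array([0.0, 0.0]))

u = cp.Variable(1)
z_next = A @ z + B @ u
# Quadratic tracking cost + linear keep-out constraint = a small QP,
# fast enough to re-solve at every control step.
cost = cp.sum_squares(z_next - z_ref) + 0.01 * cp.sum_squares(u)
constraints = [np.array([1.0, 0.0, 0.0]) @ z_next <= 0.8, cp.abs(u) <= 2.0]
cp.Problem(cp.Minimize(cost), constraints).solve()
print(u.value)
```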
Forget simulated manipulation: ManipulationNet offers a global infrastructure for benchmarking robots in the real world, complete with standardized hardware and software, to finally measure progress toward general manipulation.
By disentangling camera-space estimation from world-space refinement via dual diffusion models, DuoMo achieves state-of-the-art human motion reconstruction from noisy video, bypassing the limitations of parametric models.
Legged robots can now tiptoe around your expensive gadgets, thanks to a new RL framework that combines semantic understanding with low-level control to avoid stepping on designated objects.
VLA models struggle with physical reasoning, but Pri4R's simple trick of predicting 3D point tracks during training boosts performance by up to 40% on manipulation tasks, without adding any inference overhead.
Achieve 7x accuracy gains in real-world collaborative SLAM by using a robust, distributed optimization algorithm resilient to communication limits and noisy data.
By decomposing long-horizon manipulation into transport and object-centric interaction, LiLo-VLA achieves state-of-the-art zero-shot generalization and robustness, outperforming end-to-end VLA models by a large margin.
Forget language and appearance: CAD models can now directly prompt accurate instance segmentation of industrial objects, even with diverse surface properties.
Robots can now perform intricate assembly tasks and recover from errors in real-time, without any training, by fusing vision-language models with video-based kinematic priors for action planning.
Unlabeled monocular videos can now be used to train state-of-the-art 3D/4D reconstruction systems, thanks to a factored flow prediction approach that disentangles geometry and pose learning.
Forget trial-and-error: this work provides a theoretical recipe for scaling neural Koopman operators, showing how to optimally allocate effort between data collection and model capacity for robotic control.
Robots can now adapt their safety behavior on the fly in response to changing real-world contexts, without needing pre-programmed rules or maps.
Modularity in HRI isn't just about interchangeable parts; it's a powerful design medium for fostering long-term, evolving relationships between humans and robots.
Imagine giving robots a sense of touch as sensitive as a spiderweb, using nothing more than vibrating strings and microphones.
Stop guessing about AGV fleet management: LSMART offers a realistic, open-source simulator to benchmark MAPF algorithms in complex, lifelong scenarios, revealing the critical design choices that make or break performance.
Forget clunky skeletons: this new model lets you prompt your way to accurate 3D human meshes from single images, even in the wildest poses.
Robots can now learn long-horizon tasks far more effectively by distilling complex histories into a few key visual moments, outperforming standard imitation learning by 70% on real-world tasks.
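The one-liner doesn't spell out how the key moments are chosen; as a rough illustration of the general idea only, here's a naive change-point heuristic that keeps the frames whose features shift most (everything below is invented for illustration, not the paper's method):

```python
import numpy as np

def select_keyframes(feats: np.ndarray, k: int = 4) -> np.ndarray:
    # Keep the first frame plus the k-1 frames whose features changed most
    # since the previous frame: a compact stand-in for the full history.
    deltas = np.linalg.norm(np.diff(feats, axis=0), axis=1)
    top = np.argsort(deltas)[::-1][: k - 1] + 1
    return np.sort(np.concatenate(([0], top)))

episode = np.cumsum(np.random.randn(100, 32), axis=0)  # fake per-frame embeddings
idx = select_keyframes(episode, k=4)
print(idx)  # condition the policy on these frames instead of all 100
```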
Key contribution not extracted.
Forget static datasets: RL-based co-training unlocks +20% real-world VLA performance by interactively leveraging simulation while preserving real-world capabilities.
RynnBrain leapfrogs existing embodied foundation models, offering a unified, open-source spatiotemporal model that excels at physically grounded reasoning and planning across a wide range of benchmarks.