21 papers from CMU Machine Learning on World Models & Planning
LLMs in embodied environments get a massive boost from structured rules, with rule retrieval alone contributing +14.9 percentage points to single-trial success.
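For intuition, here is a minimal sketch of what rule retrieval can look like in practice. The bag-of-words retriever, rule text, and prompt layout below are placeholders, not the paper's actual design:

```python
import numpy as np

# Toy rule retrieval for an embodied LLM agent: embed candidate rules and the
# current observation, then prepend the top-k most similar rules to the prompt.
RULES = [
    "If the door is locked, look for a key before retrying the handle.",
    "Heat sources must be turned off before leaving a room.",
    "Carry at most one object per gripper at a time.",
]

def embed(text: str, vocab: dict[str, int]) -> np.ndarray:
    """Toy bag-of-words embedding; stands in for a learned text encoder."""
    v = np.zeros(len(vocab))
    for tok in text.lower().split():
        if tok in vocab:
            v[vocab[tok]] += 1.0
    return v

query = "the door will not open and seems locked"
vocab = {tok: i for i, tok in enumerate(
    sorted({t for r in RULES + [query] for t in r.lower().split()}))}

rule_vecs = np.stack([embed(r, vocab) for r in RULES])
q = embed(query, vocab)

# Cosine similarity, then keep the top-2 rules to prepend to the agent's prompt.
sims = rule_vecs @ q / (np.linalg.norm(rule_vecs, axis=1) * np.linalg.norm(q) + 1e-8)
top_k = [RULES[i] for i in np.argsort(-sims)[:2]]
prompt = "Relevant rules:\n- " + "\n- ".join(top_k) + f"\n\nObservation: {query}\nAction:"
print(prompt)
```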
By fusing IMU and insole pressure data within a physics simulation, GRIP achieves more physically plausible human motion capture than IMU-only methods.
Accurately simulating the snap-fit mechanics of interlocking bricks, BrickSim unlocks a new level of realism for robotic manipulation research involving complex assemblies.
Forget expensive real-world data collection: a massive, diverse synthetic dataset enables surprisingly effective zero-shot transfer for robotic manipulation.
Strategic recovery from failures is key to deploying robots for complex assembly tasks in the real world.
Injecting muscle synergy priors into reinforcement learning drastically improves the realism of simulated human locomotion, even with limited real-world data.
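A minimal sketch of one way a synergy prior can enter RL: the policy acts in a low-dimensional synergy space and a fixed non-negative matrix expands its actions to muscle excitations. The sizes and random matrix below are illustrative, not the paper's:

```python
import numpy as np

# Hypothetical setup: 4 synergies controlling 16 muscles. In practice the
# synergy basis W would come from factorizing recorded EMG (e.g., via
# non-negative matrix factorization); here it is random for illustration.
rng = np.random.default_rng(0)
n_synergies, n_muscles = 4, 16
W = np.abs(rng.normal(size=(n_muscles, n_synergies)))   # fixed synergy basis

def muscle_excitations(policy_action: np.ndarray) -> np.ndarray:
    """Map a low-dimensional policy action (synergy space) to muscle space."""
    activations = 1.0 / (1.0 + np.exp(-policy_action))  # squash to [0, 1]
    return np.clip(W @ activations, 0.0, 1.0)

# The RL policy now explores a 4-D action space instead of a 16-D one.
print(muscle_excitations(rng.normal(size=n_synergies)).shape)   # (16,)
```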
Panoramic depth perception and differentiable physics unlock surprisingly robust collision avoidance, even generalizing to unseen simulation environments.
Unlock up to 59x cost reductions in optimization by pretraining ML surrogates with cheap, imperfect labels and then refining them with self-supervision.
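A toy sketch of the two-phase recipe, assuming a smoothness-style consistency loss as the self-supervised signal (the paper's actual refinement objective may differ):

```python
import torch
import torch.nn as nn

# Toy version of the two-phase recipe. The "expensive" objective is
# f(x) = sum(x^2); the cheap labels are a biased, noisy stand-in for it.
net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def cheap_label(x: torch.Tensor) -> torch.Tensor:
    """Imperfect, low-cost supervision: scaled truth plus noise."""
    return 0.9 * (x ** 2).sum(-1, keepdim=True) + 0.1 * torch.randn(x.size(0), 1)

# Phase 1: pretrain the surrogate on abundant cheap labels.
for _ in range(200):
    x = torch.randn(128, 8)
    loss = nn.functional.mse_loss(net(x), cheap_label(x))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Phase 2: label-free refinement. Here: predictions should be stable under
# tiny input jitter, which regularizes the surrogate where labels were noisy.
for _ in range(100):
    x = torch.randn(128, 8)
    loss = nn.functional.mse_loss(net(x), net(x + 0.01 * torch.randn_like(x)))
    opt.zero_grad()
    loss.backward()
    opt.step()
```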
Unsupervised discovery of object keypoints and dynamics directly from video unlocks state-of-the-art world models for decision-making.
Skip the motion-capture grind: train your hip exoskeleton controller entirely in simulation and still see it work on real hardware.
Achieve real-time safe control of complex robots by representing their dynamics as a linear system in a higher-dimensional space, enabling fast quadratic programming for both tracking and obstacle avoidance.
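A minimal sketch of the control step this enables, using hand-picked lifting observables and a toy lifted-linear model rather than the learned ones a real system would use:

```python
import numpy as np
import cvxpy as cp

# Toy lifted-linear (Koopman-style) control step. The observables, dimensions,
# and matrices here are made up for illustration; in practice the lifting and
# the (A, B) model would be identified from data.
rng = np.random.default_rng(1)
nz, nu = 6, 2                       # lifted-state and input dimensions
A = 0.9 * np.eye(nz) + 0.01 * rng.normal(size=(nz, nz))
B = rng.normal(size=(nz, nu))

def lift(x: np.ndarray) -> np.ndarray:
    """Hand-picked observables mapping a 2-D state into the 6-D lifted space."""
    return np.concatenate([x, x**2, [np.sin(x[0]), np.cos(x[1])]])

z = lift(np.array([0.5, -0.2]))
z_ref = np.zeros(nz)                # track the lifted origin

# One QP solve per control step: quadratic tracking cost, linear dynamics.
u = cp.Variable(nu)
cost = cp.sum_squares(A @ z + B @ u - z_ref) + 0.1 * cp.sum_squares(u)
limits = [cp.norm(u, "inf") <= 1.0]  # actuator bounds; obstacle avoidance would
                                     # add further linear constraints on A z + B u
cp.Problem(cp.Minimize(cost), limits).solve()
print("control input:", u.value)
```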
Forget hand-engineering world models – this work proves that competent agents *must* internally represent the world in a structured, predictive way to minimize regret under uncertainty.
Legged robots can now tiptoe around your expensive gadgets, thanks to a new RL framework that combines semantic understanding with low-level control to avoid stepping on designated objects.
VLA models struggle with physical reasoning, but Pri4R's simple trick of predicting 3D point tracks during training boosts performance by up to 40% on manipulation tasks, without adding any inference overhead.
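To make the trick concrete, here is a hedged sketch of a shared backbone with an auxiliary track head used only at training time. The encoder, head sizes, and loss weight are assumptions, not Pri4R's actual architecture:

```python
import torch
import torch.nn as nn

# The shape of the trick: an auxiliary head regresses future 3-D point tracks
# during training and is simply skipped at inference, so deployment cost is
# unchanged.
class VLAWithTrackAux(nn.Module):
    def __init__(self, d=256, n_points=32, horizon=8, action_dim=7):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(512, d), nn.ReLU())  # stand-in encoder
        self.action_head = nn.Linear(d, action_dim)
        self.track_head = nn.Linear(d, n_points * horizon * 3)       # (x, y, z) per step

    def forward(self, obs, with_tracks: bool = False):
        h = self.backbone(obs)
        if with_tracks:                         # training path
            return self.action_head(h), self.track_head(h)
        return self.action_head(h)              # inference path: no extra compute

model = VLAWithTrackAux()
obs = torch.randn(4, 512)                       # placeholder observation features
actions, tracks = model(obs, with_tracks=True)
# Placeholder targets stand in for demonstrated actions and ground-truth tracks.
loss = nn.functional.mse_loss(actions, torch.zeros_like(actions)) \
     + 0.5 * nn.functional.mse_loss(tracks, torch.zeros_like(tracks))
loss.backward()
```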
By pausing to "think" with latent diffusion, STAR-LDM achieves superior language understanding, narrative coherence, and controllable generation compared to standard autoregressive models of similar size.
Robots can now perform intricate assembly tasks and recover from errors in real time, without any training, by fusing vision-language models with video-based kinematic priors for action planning.
Robots can now adapt their safety behavior on the fly in response to changing real-world contexts, without needing pre-programmed rules or maps.
Stop guessing about automated guided vehicle (AGV) fleet management: LSMART offers a realistic, open-source simulator for benchmarking multi-agent path finding (MAPF) algorithms in complex, lifelong scenarios, revealing the critical design choices that make or break performance.
Robots can now learn long-horizon tasks far more effectively by distilling complex histories into a few key visual moments, outperforming standard imitation learning by 70% on real-world tasks.
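One simple way such distillation could work, sketched under an assumed feature extractor and a greedy change-detection rule (not necessarily the paper's exact method):

```python
import numpy as np

# Distill a long observation history into a few key frames: greedily keep
# frames whose (stand-in) features differ most from the last kept frame.
rng = np.random.default_rng(2)
history = rng.normal(size=(500, 128))           # 500 frames of 128-d features

def select_keyframes(feats: np.ndarray, k: int = 5, thresh: float = 12.0) -> list[int]:
    kept = [0]                                  # always keep the first frame
    for t in range(1, len(feats)):
        if np.linalg.norm(feats[t] - feats[kept[-1]]) > thresh:
            kept.append(t)
    # If the threshold kept too many, retain only the k most recent key moments.
    return kept[-k:]

idx = select_keyframes(history)
policy_input = history[idx]                     # compact context for the policy
print(idx, policy_input.shape)
```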
RynnBrain leapfrogs existing embodied foundation models, offering a unified, open-source spatiotemporal model that excels at physically grounded reasoning and planning across a wide range of benchmarks.
Forget rigid game environments – PAN lets you simulate open-world scenarios with language-specified actions and long-term visual coherence, opening the door to more realistic AI training.