Internal world models for prediction, model-based planning, simulation, and environment modeling in AI systems.
Interactive 3D asset generation can now be driven by functional logic and hierarchical physics, thanks to a new framework that synthesizes simulation-ready assets.
Predictive representation learning fundamentally fails to learn causal system dynamics, instead latching onto environmental correlations, even when it hurts prediction accuracy.
Exponentially many policies in Tree MDPs don't have to mean exponential computation: clever confidence bounds let you treat policy selection as a tractable bandit problem.
You don't need a full causal graph to avoid undesired outcomes; learning a simple order structure can be enough, and even outperform methods that try to learn the whole graph.
Reservoir Computing offers a surprisingly effective way to build Koopman dictionaries for nonlinear system identification, sidestepping the usual dictionary selection and ill-conditioning problems.
Token embedding geometry isn't just abstract math: it directly mirrors how language models internally represent and reason about the world, as shown by its alignment with board state and piece importance in chess.
Average-reward RL can finally handle the messy reality of non-stationary rewards and durations in SMDPs, thanks to a clever harmonic mean trick.
Control-dependent latent dynamics, achieved with a surprisingly small parameter increase, unlock robust MPC performance in time-varying environments where standard Koopman methods falter.
Aligning random seeds across rollout simulations can significantly boost the performance of simulation-based planning, even in complex environments like Ludo.
Achieve robust long-horizon visual control by adaptively balancing model-based lookahead with bootstrapping, enabling zero-shot transfer to real-world tasks with severe occlusions.
Verifier-driven executable world models can solve complex reasoning tasks like ARC-AGI-3 without game-specific code, hinting at a path towards more generalizable AI agents.
Forget hand-crafted reward functions: this RL framework lets a bicycle robot learn complex stunts from just a spatial guideline and a few key poses.
Predicting driver behavior in response to traffic conditions is now possible with a new world model that causally links external context to internal driver states.
Gradient-based MPC can finally beat gradient-free methods in continuous control, thanks to Dream-MPC's clever combination of learned policies, world models, uncertainty regularization, and optimization amortization.
Optimizing wildfire suppression via integer programming and machine learning can significantly reduce burned areas and improve resource allocation, offering a data-driven approach to a critical real-world problem.
Get high-fidelity tactile simulations with 65% speedup and 40% less memory by combining coarse physics with neural implicit reconstruction.
Autonomous driving gets a boost: CRAFT combines dense counterfactual supervision with grounded closed-loop feedback to significantly improve driving policies.
RL fine-tuning unlocks a 6x performance gain for in-place trajectory editing in autonomous driving, demonstrating the power of aligning diffusion planners with reinforcement learning.
Stop relying on significance tests that only find differences: this Bayesian framework tells you if your synthetic data is *practically equivalent* to real-world data for your specific safety assessment task.
Achieve real-time bipedal walking control by cleverly swapping high-fidelity for low-fidelity models in MPC, slashing computation without sacrificing stability.
Diffusion models can now plan effectively for long-horizon tasks by strategically generating subgoals that are then efficiently realized by rectified flow models.
Generate more realistic and diverse safety-critical autonomous vehicle scenarios by using conditional latent flow matching to bridge the gap between real-world and simulated data.
Dynamically adjusting trajectory optimization based on real-time navigation confidence enables robust low-thrust rendezvous, slashing miss distances by two orders of magnitude when faced with degraded sensor data.
Current video generation benchmarks overlook crucial aspects of physical plausibility and temporal coherence, highlighting the need for holistic evaluation metrics like PhyScore.
Current world models struggle with basic physical interaction tasks like distance perception and trajectory following, highlighting a critical gap in their ability to simulate realistic environments.
LLM-powered simulations can train cyberbullying intervention, but only after users overcome key attention deficits that prevent them from recognizing the need for public action.
Predicting ICU admission risk from patient data improves from AUC 0.642 to 0.942 as more clinical events are observed, highlighting the value of continuous, dynamically aware predictive monitoring.
End-to-end learning can beat even the best industrial solvers at multi-agent task assignment, improving solution quality by 20% while slashing computation time from hours to seconds.
Quadrupedal robots can now perform dynamic loco-manipulation in the real world, matching human teleoperation, using only onboard egocentric vision and a low-frequency (5 Hz) open-vocabulary detector.
Guaranteeing safety and liveness in complex control systems doesn't require monolithic design; this work shows how to decompose the problem across layers with formal contracts.
Reactive dexterous grasping can be achieved with zero-shot transfer to real-world objects by decoupling high-level RL planning from low-level QP control, enabling dynamic adjustments to safety margins without retraining.
Achieve 15% faster order completion in warehouse robotics with a new deep reinforcement learning approach that jointly optimizes robot scheduling and order allocation in real time.
Robot video world models can be significantly improved by distilling a multimodal reward function and stabilizing long-horizon inference, leading to better instruction following and manipulation accuracy.
Escape deadlocks and choreograph robots through complex tasks with this new hybrid control architecture that merges planning and control.
Domain randomization doesn't just make your robot policies more robust; it fundamentally warps the optimization landscape, potentially guiding your search towards better contact-rich behaviors.
Differentiating through physical simulations just got a whole lot easier: Neural Control avoids unrolling iterative solvers by using an adjoint formulation, enabling memory-efficient gradient-based control.
Autonomous vehicles can now stick to the plan even with disturbances, thanks to a new control method that learns and compensates for unmodeled dynamics.
Future power grids can learn from human cognition and octopus intelligence to build more robust and responsive decision-making systems.
Encoding temporal prediction into video VAEs unlocks faster training, better generative performance, and improved downstream task performance, all at once.
Forget grid layouts: Map2World lets you generate consistent 3D worlds from arbitrary segment maps, offering unprecedented control and scalability.
Detecting distribution shifts in visual MBRL is the easy part; the real challenge is applying the right action-level corrections, which this paper tackles with a novel local expert growth strategy.
Even the most advanced language models still lose money and demonstrate unsophisticated strategies when tasked with maximizing long-term bankroll growth in a realistic sports betting simulation, highlighting a significant gap in their sequential decision-making capabilities.
Semantic rollouts and town-adversarial regularization can significantly boost zero-shot driving performance in unseen CARLA towns, even without explicit navigation commands or map inputs.
Polymorph selection in metal-organic frameworks happens surprisingly early, starting at the pre-nucleation cluster stage.
Forget per-scene optimization: GenWildSplat achieves state-of-the-art 3D reconstruction from sparse, unposed images in real time using a purely feed-forward approach.
Achieve state-of-the-art open-vocabulary occupancy prediction without any training data, outperforming supervised and self-supervised methods by a large margin.
Control over physical properties like friction and restitution in generated videos is now possible, paving the way for more realistic and controllable video synthesis.
Today's visual generation models are often evaluated on the wrong things, leading to inflated performance claims that mask critical failures in spatial reasoning, temporal consistency, and causal understanding.
Reconstructing real-world scenes in Minecraft unlocks a customizable embodied AI playground, but only if we can solve the occupancy prediction bottleneck – and this new dataset shows we're not there yet.
Forget painstakingly programming robot interactions – ExoActor uses video generation to hallucinate plausible behaviors, then translates them into robot actions.