Search papers, labs, and topics across Lattice.
67 papers published across 7 labs.
Forget generating plausible-but-fake details: 3DreamBooth bakes a robust 3D prior into video generation models using only a single-frame optimization, enabling truly view-consistent customized subject videos.
Adding the T-pentomino to Tetris Block Puzzle makes the game significantly harder, quantified by a slowdown in SGAZ agent convergence.
Maximizing entropy of future state-action visitations boosts feature coverage within single RL trajectories, offering a new exploration strategy.
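The idea generalizes beyond any one paper: below is a minimal sketch of an entropy-style coverage bonus, assuming states and actions have already been discretized into hashable features. Function names and the `beta` weight are illustrative, not from the paper.

```python
import math
from collections import Counter

def visitation_entropy(features):
    """Shannon entropy of the empirical distribution of discretized
    state-action features visited in a single trajectory."""
    counts = Counter(features)
    n = len(features)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def entropy_bonus(trajectory_features, beta=0.1):
    # Reward shaping: trajectories that spread visits over many
    # distinct features (higher entropy) earn a larger bonus.
    return beta * visitation_entropy(trajectory_features)
```

A trajectory that revisits one feature scores zero, while one that covers many features uniformly maximizes the bonus.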
Coordinating multi-robot teams to complete manipulation tasks just got easier: GoC-MPC handles dynamic task assignments and disturbances without training data or environment models.
VAE-GANs let you have your cake and eat it too: high-fidelity geological models *and* accurate history matching in reservoir simulation, something previous DL methods couldn't deliver.

Predictive policing algorithms can exhibit extreme racial bias, with one city showing a 157x higher detection rate for one racial group in a single year.
Discovering an agent's hidden intentions is now possible by analyzing their interventions within a causal model, revealing the "why" behind their actions.
Gaussian assumptions about Earth structure introduce bias and significantly under-report moment tensor uncertainties, but simulation-based inference offers a robust alternative for more reliable earthquake source characterization.
Forget hand-crafted assets and heuristics: V-Dreamer uses video generation models to automatically create diverse, physically plausible robotic simulation environments and trajectories directly from language.
Differentiable collision checking in configuration space, previously a major hurdle, is now achievable with zero-shot generalization thanks to CSSDF-Net.
Achieve more physically realistic video generation by explicitly modeling 3D geometry and physical attributes across multiple viewpoints.
Humanoid robots can now traverse complex terrains with human-like gaits, thanks to a surprisingly simple and efficient framework that eschews adversarial training.
Achieve 9x lower trajectory error and 3x better FID in motion generation by using a diffusion-based discrete motion tokenizer that elegantly handles both semantic and kinematic constraints.
Autonomous driving models can be made significantly more robust and safe by explicitly de-confounding their training via causal intervention, eliminating reliance on spurious correlations.
Stop leaking your secrets to the cloud: PlanTwin lets LLM agents plan over your private data without actually exposing it.
Ditch the heavyweight controllers: these lightweight MPC approaches bring real-time attitude synchronization to resource-constrained spacecraft.
Differentiable environments and backpropagation offer a surprisingly effective alternative to reinforcement learning for AAV trajectory optimization, sidestepping credit assignment problems.
Decentralized MPC with control barrier functions lets multi-robot quadrupeds safely navigate complex environments in real-time, achieving performance on par with centralized approaches but with significantly reduced computation.
Digital twins can now discriminate between different types of cyberattacks on critical infrastructure, enabling targeted responses instead of costly full shutdowns.
LLMs can navigate more efficiently in unfamiliar environments by reasoning over a tree of possible paths, not just isolated waypoints, enabling them to consider en-route information gain and prune unpromising branches.
Robots can learn faster and generalize better by encoding dynamics directly into their neural network architecture, outperforming standard transformers and GNNs.
Forget painstakingly designing simulation environments: generative 3D world models let you RL-fine-tune robot VLAs with massive scene diversity, boosting real-world transfer by 3x.
Unlock real-time control for massive multi-agent swarms: this method slashes computation from cubic to linear with horizon length, making long-horizon density-driven control practical.
Neural solvers can now effectively handle the complexities of multi-agent coordination and multi-objective trade-offs in routing problems, outperforming traditional heuristics.
MLLMs can gain surprisingly strong 3D spatial reasoning abilities simply by tapping into the latent knowledge already present in video generation models.
Optimal multi-agent path planning with asynchronous actions is now provably complete, sidestepping the theoretical incompleteness of prior continuous-time approaches.
Guaranteeing safety in spacecraft autonomy is now more tractable: a CBF-CLF informed imitation learning approach achieves NMPC-level performance with real-time feasibility on commodity hardware.
Agents can now "hallucinate" optimal viewpoints for reasoning by storing and re-rendering scenes with 3D Gaussian Splatting, enabling recovery from initial observation failures.
Hierarchical memory, inspired by human cognition, beats standard approaches in robotic manipulation tasks requiring both precise tracking and long-term retention.
Robots can now manipulate objects with greater dexterity and adaptability thanks to a new world model that leverages both vision and high-frequency tactile feedback to predict and react to contact dynamics.
Standard DRL collapses in volatile environments because it mistakes irreducible noise for a lack of data, but RE-SAC fixes this by explicitly separating these uncertainties.
Robots can now train in realistic, thermally-accurate simulated fires, paving the way for safer and more reliable real-world firefighting deployments.
Achieve real-time online learning for model predictive control with a novel spatio-temporal Gaussian Process approximation that maintains constant computational complexity.
By iteratively reasoning over video snippets with a Chain-of-Thought, R²VLM achieves state-of-the-art long-horizon task progress estimation without needing to process entire videos at once.

Ditching rigid digital twins for adaptable world models could unlock truly intelligent edge computing in 6G networks.
By treating 3D scene editing as goal-regressive planning rather than pure generation, Edit-As-Act achieves instruction fidelity, semantic consistency, and physical plausibility that existing methods miss.
Legged robots can navigate more reliably with noisy sensors thanks to a new state estimator that avoids Gaussian noise assumptions.
Achieve stable, real-time kilometer-scale autonomous driving simulations by generating vector-graph tiles incrementally using a novel diffusion flow approach.
Seemingly accurate physics-informed surrogates can fail spectacularly when integrated into power system simulations, especially under stress, highlighting the need for rigorous in-simulator validation.
Generate consistent stereo videos directly from RGB data, bypassing depth estimation and monocular-to-stereo conversion, with StereoWorld's novel camera-aware attention mechanisms.
Representing highly nonlinear vehicle dynamics in a lifted linear space via Koopman operator theory enables state-of-the-art long-term state estimation for complex electric trucks.
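To see why Koopman lifting helps, here is a classic textbook sketch (not the paper's truck model): a nonlinear system that becomes exactly linear in a hand-picked lifted space, so long-horizon prediction reduces to iterating a linear map.

```python
def nonlinear_step(x1, x2, a=0.9, b=0.5, c=0.2):
    # Nonlinear dynamics: x1+ = a*x1,  x2+ = b*x2 + c*x1**2
    return a * x1, b * x2 + c * x1 ** 2

def lifted_step(z, a=0.9, b=0.5, c=0.2):
    # Koopman lift z = (x1, x2, x1**2): the dynamics are exactly
    # linear in z, because (x1+)**2 = a**2 * x1**2.
    z1, z2, z3 = z
    return (a * z1, b * z2 + c * z3, a * a * z3)

x1, x2 = 1.5, -0.3
z = (x1, x2, x1 ** 2)
for _ in range(10):
    x1, x2 = nonlinear_step(x1, x2)
    z = lifted_step(z)
# The linear lifted model reproduces the nonlinear trajectory.
assert abs(z[0] - x1) < 1e-12 and abs(z[1] - x2) < 1e-12
```

Real vehicle dynamics admit no exact finite lift, so methods in this vein learn an approximate dictionary of lifting functions from data; the payoff is the same: linear prediction and estimation machinery applied to a nonlinear system.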
Simulate earthquake ground motion 10,000x faster with a new latent operator flow matching method, opening the door to real-time risk assessment for critical infrastructure.
Forget rigid physics engines, this badminton RL environment uses real player data to simulate realistic rallies and strategic gameplay.
Heuristic maritime routes lead to extreme fuel waste in nearly 5% of voyages, but this RL approach cuts that risk by almost 10x.
LLMs in embodied environments get a massive boost from structured rules, with rule retrieval alone contributing +14.9 pp to single-trial success.
VLN agents can navigate more effectively by predicting their future states and proactively planning based on forecasted semantic map cues, rather than relying solely on historical context.
Encoding deformable object dynamics with particle positions unlocks sim-to-real transfer for manipulation tasks, achieving impressive real-world success rates.
Drones can now land safely in complex, unknown environments using only a camera, thanks to a new system that dynamically maps and reacts to surroundings in real-time.
Ditch fixed compute budgets: this new flow-matching method for robotic control adaptively allocates computation, speeding up simple tasks and focusing on complex ones.
ManiDreams lets robots handle real-world uncertainty in manipulation tasks without retraining, outperforming standard RL baselines under various perturbations.
Robot world models can be significantly improved by directly rewarding them for generating videos that lead to physically plausible robot actions, even if the videos themselves contain visual artifacts.
A complete autonomy stack enables centimeter-level localization and mapping on the moon, where no GPS is available.

Finally, a rigorous RL benchmark: generate environments with *provably* optimal policies, enabling controlled algorithm evaluation against ground truth.
Accurately predict urban pollutant dispersion in real-time with a novel data-driven model that's orders of magnitude faster than traditional CFD.
Demonstrator diversity unlocks the ability to learn latent actions and dynamics from offline RL data, even without explicit action labels.
LLMs can be economically aligned to real-world consumer preferences via post-training on transaction data, enabling more accurate and stable economic simulations.
By cleverly turning novel view synthesis into a self-supervised inpainting problem, VisionNVS eliminates the need for ground truth images of novel views, outperforming LiDAR-dependent baselines.
Forget finetuning: DynaEdit unlocks complex video edits like action modification and object insertion, all without training, using clever manipulation of pretrained text-to-video models.
Achieve zero-shot adaptation to new tasks in complex control environments by learning a shared low-dimensional goal embedding that unifies policy and value function representations.
NeRFs can now guide extraterrestrial rovers around unexpected obstacles, thanks to a novel planning framework that blends local observations with global terrain understanding.
Q-value policies, traditionally outperformed by state-value policies in planning, can surpass them with the right regularization, offering a faster alternative for policy evaluation.
Robots can now plan 9x faster and achieve significantly higher success rates by decoupling action prediction from video generation in World-Action Models.
A new mixed reality testbed lets you plug real human drivers into a CAV simulation, offering unprecedented realism for testing autonomous vehicle interactions.
Guaranteeing robot safety and task completion just got easier: this method enforces complex temporal logic constraints on pre-trained robotics models without any fine-tuning.
Human unpredictability is now a feature, not a bug: a mixed-reality testing framework leverages human interaction to generate high-quality corner cases for vehicle-infrastructure cooperation systems.
Autoregressive neural surrogates can now simulate dynamical systems for infinitely long horizons, thanks to a novel self-refining diffusion model that avoids error compounding.
Ditch the data augmentation and decoders: R2-Dreamer's Barlow Twins-inspired objective delivers faster, more versatile MBRL, especially when spotting the small stuff matters.