VLAs aren't just memorizing training data; sparse autoencoders reveal a hidden layer of generalizable motion primitives that can be steered to control robot behavior across tasks.
Robots often ignore your commands mid-task, but ReSteer offers a way to fix this by pinpointing and patching the "blind spots" in their training data.
Encoding deformable object dynamics with particle positions unlocks sim-to-real transfer for manipulation tasks, achieving impressive real-world success rates.
Stochastic resetting, randomly teleporting RL agents back to the start, surprisingly speeds up learning, even when it wouldn't help a non-learning agent.
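The mechanism is simple to state: with some probability at every step, the environment teleports the agent back to its initial state. A toy sketch of that idea (not the paper's method; `env_step`, `reset`, and the 1-D chain are illustrative stand-ins):

```python
import random

def rollout_with_resetting(env_step, reset, policy, horizon, p_reset=0.1, seed=0):
    """Run one episode, teleporting the agent back to the start
    with probability p_reset at every step (stochastic resetting)."""
    rng = random.Random(seed)
    state = reset()
    trajectory = [state]
    for _ in range(horizon):
        if rng.random() < p_reset:
            state = reset()          # teleport back to the initial state
        else:
            state = env_step(state, policy(state))
        trajectory.append(state)
    return trajectory

# Toy 1-D chain: the agent walks right; a reset returns it to 0.
traj = rollout_with_resetting(
    env_step=lambda s, a: s + a,
    reset=lambda: 0,
    policy=lambda s: 1,
    horizon=20,
    p_reset=0.2,
)
```

In a learning agent, the extra visits to early states concentrate experience where value estimates propagate from, which is where the speedup can come from.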
Forget slow, model-dependent curation: FAKTUAL offers a fast, model-free way to boost robot imitation learning by directly maximizing the entropy of demonstration datasets.
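Because the objective is just the entropy of the dataset itself, curation needs no model in the loop. A minimal sketch of the idea, assuming demos are tagged with discrete labels and selected greedily (the labels and the greedy rule are illustrative, not FAKTUAL's actual algorithm):

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (nats) of a count distribution."""
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values() if c)

def greedy_max_entropy_subset(labels, k):
    """Greedily pick k demos so the label distribution of the
    selected subset has maximal entropy (model-free curation)."""
    chosen, counts = [], Counter()
    remaining = list(range(len(labels)))
    for _ in range(k):
        best_i, best_h = None, -1.0
        for i in remaining:
            counts[labels[i]] += 1     # tentatively add demo i
            h = entropy(counts)
            counts[labels[i]] -= 1     # undo the tentative add
            if h > best_h:
                best_i, best_h = i, h
        chosen.append(best_i)
        counts[labels[best_i]] += 1
        remaining.remove(best_i)
    return chosen

# Skewed pool: entropy-maximizing selection balances the categories.
pool = ["pick"] * 6 + ["place"] * 2 + ["push"]
idx = greedy_max_entropy_subset(pool, k=3)
```

On the skewed pool above, the greedy selection keeps one demo per category rather than mirroring the pool's imbalance.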
A robot can now play recognizable piano songs after just 30 minutes of real-world training, closing the sim-to-real gap for high-precision bimanual manipulation.
RADAR offers a scalable, interpretable framework for understanding robot policy generalization by directly linking test-time performance to the training data, revealing the specific types of generalization required.
Semi-decentralized POMDPs offer a unifying framework that subsumes decentralized and multiagent POMDPs, enabling a more nuanced approach to communication constraints in multi-agent systems.
Ditch the anchors and NMS: AutoReg3D reimagines 3D object detection as a sequence generation problem, opening the door for language-model techniques in 3D perception.
Forget retraining: you can steer a robot's behavior in real-time by nudging its internal representations with lightweight, targeted interventions.
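One common form of such an intervention is adding a steering vector to a hidden activation at test time, leaving all weights untouched. A toy numpy sketch under that assumption (the tanh stack and names are illustrative, not the paper's architecture):

```python
import numpy as np

def steered_forward(x, layers, steer_layer, steer_vec, alpha=1.0):
    """Forward pass through a stack of (W, b) layers, adding a
    steering vector to the hidden activation at one layer --
    a lightweight test-time intervention, no retraining needed."""
    h = x
    for i, (W, b) in enumerate(layers):
        h = np.tanh(W @ h + b)
        if i == steer_layer:
            h = h + alpha * steer_vec   # nudge the internal representation
    return h

rng = np.random.default_rng(0)
layers = [(rng.standard_normal((4, 4)), rng.standard_normal(4)) for _ in range(3)]
x = rng.standard_normal(4)
base = steered_forward(x, layers, steer_layer=1, steer_vec=np.zeros(4))
nudged = steered_forward(x, layers, steer_layer=1, steer_vec=np.ones(4), alpha=0.5)
```

The strength `alpha` and the choice of layer are the only knobs, which is what makes the intervention cheap enough to apply in real time.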
Robots can now remember what they've done and what they need to do next for 15 minutes straight, thanks to a new memory architecture that mixes video and text.
Turns out, the best memory design for robotic manipulation depends heavily on the task, with no single architecture dominating across the board.
Unlock compliant robot control without force sensors or complex learning, using only motor signals already available in most modern robots.
By unifying hand motion estimation and generation into a single diffusion framework, UniHand handles heterogeneous inputs and challenging conditions like occlusions better than task-specific models.
Robots can now learn from their mistakes in real-time via a novel reflective planning framework, leading to significant performance gains in long-horizon tasks.
Diffusion models can efficiently sample lookahead action sequences for active search, outperforming traditional tree search while mitigating optimism bias.
Robots can now navigate complex outdoor environments and find objects using natural language queries, even without prior maps or precise depth sensing.
XR gets real: control virtual worlds with your head and hands, not just text prompts.
Autonomous inspection robots can now anticipate failures and anomalies in real-time with over 90% accuracy, even before a human observer can react.
A single RL policy trained on procedurally generated tools in simulation can achieve zero-shot dexterous manipulation of diverse real-world tools, rivaling task-specific policies.
Factored world models can disentangle the dynamics of multiple interacting entities, leading to more controllable video generation and improved policy learning.
Forget synthetic data: scaling up human egocentric video by 20x unlocks surprisingly effective dexterous robot manipulation, even transferring to robots with different hand configurations.
Verification at test time can be a surprisingly effective alternative to scaling policy learning for vision-language-action alignment, yielding substantial gains in both simulated and real-world robotic tasks.
Closing the reality gap: iteratively refining a world model with real-world robot data yields a significant boost in vision-language-action policy performance.
A unified Vision-Language Model and Diffusion architecture unlocks surprisingly effective optical flow forecasting from noisy web data, enabling language-conditioned robot control and video generation.
A novel long-reach robot arm overcomes structural instability to thread cables with centimeter precision, unlocking new possibilities for autonomous lunar construction.
Q-functions and implicit policy extraction are game-changers for batch online RL in robotics, unlocking significant performance gains over imitation-based approaches.
An end-to-end learned robotic system can now clean your kitchen in a completely new house, thanks to a novel co-training approach on diverse data.