21 papers from Berkeley AI Research (BAIR) on Robotics & Embodied AI
Running robotic manipulation workloads entirely onboard kills robot batteries, but offloading to the cloud tanks accuracy due to network latency, revealing a critical compute placement trade-off.
A new architecture inspired by human cognition, which flexibly switches between observation, active behavior, and meta-control, could ease current AI's reliance on curated data.
Teaching robots to manipulate objects just got easier: OCRA learns directly from human demonstration videos by focusing on object interactions and incorporating tactile feedback.
Ditch the clunky controllers: this hand-shadowing pipeline lets you teleoperate a robot arm with just an RGB-D camera and some clever inverse kinematics.
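The blurb does not spell out the retargeting math, but a hand-shadowing pipeline of this kind typically closes the loop with an inverse-kinematics step from the tracked wrist pose to joint targets. A minimal sketch of a damped least-squares IK update, assuming a 6-DoF pose error and a `jacobian_fn` supplied by the robot model (both illustrative, not the paper's interface):

```python
import numpy as np

def dls_ik_step(q, current_pose, target_pose, jacobian_fn, damping=0.05, step_scale=0.5):
    """One damped-least-squares IK step toward a tracked hand pose.

    q            : current joint angles, shape (n,)
    current_pose : end-effector pose as a 6-vector (xyz position + rotation-vector
                   orientation) -- a placeholder convention
    target_pose  : desired 6-vector from the RGB-D hand tracker
    jacobian_fn  : callable returning the 6 x n geometric Jacobian at q (assumed to be
                   available from the robot model; not part of the paper's interface)
    """
    error = target_pose - current_pose               # 6-vector task-space error
    J = jacobian_fn(q)                               # 6 x n Jacobian
    # Damped least squares: dq = J^T (J J^T + lambda^2 I)^-1 e
    JJt = J @ J.T + (damping ** 2) * np.eye(6)
    dq = J.T @ np.linalg.solve(JJt, error)
    return q + step_scale * dq
```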
Train one RL agent to handle a whole family of reward functions, unlocking robust and adaptable policies without the complexity of multi-task training.
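One common way to make a single agent cover a family of rewards is to condition the policy on the reward's parameters; the sketch below shows that idea with an illustrative MLP and tanh action head, not the paper's actual architecture:

```python
import torch
import torch.nn as nn

class RewardConditionedPolicy(nn.Module):
    """Toy policy conditioned on reward-function parameters.

    One reading of 'one agent for a family of rewards': concatenate the observation
    with a vector w that parameterizes the reward, so a single network can act for
    any reward in the family. Layer sizes and the tanh action head are illustrative.
    """

    def __init__(self, obs_dim, reward_param_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + reward_param_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs, reward_params):
        # Condition on the reward parameters by simple concatenation.
        return torch.tanh(self.net(torch.cat([obs, reward_params], dim=-1)))
```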
Robots can now remember what they've done and what they need to do next for 15 minutes straight, thanks to a new memory architecture that mixes video and text.
LLMs can now generate more accurate and complex CAD models by pointing to existing geometric entities, rather than relying on discretized command sequences prone to topological errors.
Forget simulated manipulation—ManipulationNet offers a global infrastructure for benchmarking robots in the real world, complete with standardized hardware and software, to finally measure progress toward general manipulation.
Robots can now peel cucumbers, apples, and potatoes with 90% success by learning from a small number of human preferences, even generalizing to new produce types.
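For readers unfamiliar with preference-based learning, a Bradley-Terry style loss is the standard way to fit a reward model from pairwise human preferences; whether the peeling system uses exactly this objective is an assumption, and `reward_model` here is a hypothetical per-step reward network:

```python
import torch
import torch.nn as nn

def preference_loss(reward_model, traj_a, traj_b, prefer_a):
    """Bradley-Terry preference loss for fitting a reward model from pairwise labels.

    reward_model : assumed to map (B, T, feat) trajectory features to (B, T, 1)
                   per-step rewards (hypothetical interface)
    traj_a/b     : (B, T, feat) feature tensors for the two compared trajectories
    prefer_a     : (B,) float tensor, 1.0 if the human preferred trajectory A, else 0.0
    """
    # Sum per-step rewards to score each trajectory.
    r_a = reward_model(traj_a).sum(dim=1).squeeze(-1)
    r_b = reward_model(traj_b).sum(dim=1).squeeze(-1)
    # P(A preferred) = sigmoid(r_a - r_b); fit with binary cross-entropy.
    return nn.functional.binary_cross_entropy_with_logits(r_a - r_b, prefer_a)
```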
Achieve globally consistent 3D reconstruction over sequences exceeding 19,000 frames by combining test-time training with sliding-window attention, reducing absolute trajectory error (ATE) on KITTI by over 74% relative to prior state of the art.
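As a point of reference for the attention mechanism, here is a minimal causal sliding-window attention in NumPy; the window size and single-head setup are illustrative, and the test-time-training component is not shown:

```python
import numpy as np

def sliding_window_attention(q, k, v, window=64):
    """Single-head attention where each query attends only to keys in a local causal
    window -- the kind of locality that keeps per-frame cost bounded on very long
    sequences. Shapes: q, k, v are (T, d)."""
    T, d = q.shape
    scores = q @ k.T / np.sqrt(d)                   # (T, T) similarity scores
    idx = np.arange(T)
    # Mask keys outside [i - window, i] for each query i (causal and local).
    mask = (idx[None, :] > idx[:, None]) | (idx[None, :] < idx[:, None] - window)
    scores[mask] = -np.inf
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```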
Forget tedious manual calibration: D-REX automatically builds high-fidelity digital twins by identifying object mass directly from real-world grasping data.
Unlock autonomous driving with YouTube: a new label-free pretraining method learns driving representations directly from unposed in-the-wild videos, outperforming LiDAR baselines with only a single monocular camera.
Ditching explicit 3D geometry, RAYNOVA achieves SOTA multi-view video generation by modeling spatio-temporal relationships directly with a dual-causal autoregressive framework and Plücker-ray positional encoding.
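Plücker-ray positional encoding has a standard construction: each pixel's ray is encoded by its direction d and moment m = o × d, where o is the camera center. A sketch, assuming world-to-camera extrinsics [R | t]; how RAYNOVA feeds the encoding into its autoregressive model is not shown here:

```python
import numpy as np

def plucker_ray_encoding(K, R, t, H, W):
    """Per-pixel Plücker ray coordinates (d, m) with m = o x d.

    K is the 3x3 intrinsics, [R | t] the world-to-camera extrinsics, so the camera
    center in world coordinates is o = -R^T t. Returns an (H, W, 6) array."""
    o = -R.T @ t                                              # camera center in world frame
    u, v = np.meshgrid(np.arange(W) + 0.5, np.arange(H) + 0.5)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)          # (H, W, 3) homogeneous pixels
    dirs_cam = pix @ np.linalg.inv(K).T                       # back-project pixels to camera rays
    dirs_world = dirs_cam @ R                                  # apply R^T row-wise: rays in world frame
    dirs_world /= np.linalg.norm(dirs_world, axis=-1, keepdims=True)
    moments = np.cross(np.broadcast_to(o, dirs_world.shape), dirs_world)
    return np.concatenate([dirs_world, moments], axis=-1)      # (H, W, 6) Plücker encoding
```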
You can now train autonomous driving VLAs on 60% less data and without any reasoning annotations, thanks to a fix for difficulty bias in Group Relative Policy Optimization.
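For context, GRPO computes advantages relative to a group of rollouts on the same prompt, and the per-group standard-deviation normalization is one place a difficulty bias can creep in; whether dropping it matches the paper's fix is an assumption:

```python
import numpy as np

def group_relative_advantages(rewards, normalize_by_std=True, eps=1e-6):
    """GRPO-style advantages for one group of rollouts sharing a prompt.

    rewards: (G,) rewards for G sampled completions of the same prompt.
    With normalize_by_std=True, the advantage is (r - mean) / std, so groups whose
    rewards barely vary (too-easy or too-hard prompts) have their small differences
    amplified -- one way difficulty bias shows up. Setting it to False is a common
    mitigation; this may or may not be the paper's actual fix.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    adv = rewards - rewards.mean()
    if normalize_by_std:
        adv = adv / (rewards.std() + eps)
    return adv
```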
Forget synthetic data—scaling up human egocentric video by 20x unlocks surprisingly effective dexterous robot manipulation, even transferring to robots with different hand configurations.
Humanoid robots can now perform vision-based parkour, chaining together dynamic skills like climbing, vaulting, and rolling, adapting to real-time obstacle changes.
Autonomous driving benchmarks get a reality check: ScenicRules exposes failures by combining prioritized, multi-objective rules with formally modeled, stochastic scenarios.
Forget clunky skeletons: this new model lets you prompt your way to accurate 3D human meshes from single images, even in the wildest poses.
Key contribution not extracted.
Safety filtering for stochastic robotic systems gets a spectral treatment: EigenSafe learns a safety filter from the dominant eigenpair of a dynamic programming operator, escaping the limitations of existing safety methods.
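The dominant eigenpair of an operator can be approximated with plain power iteration; the sketch below assumes a discretized linear operator supplied as `apply_operator` and is not the EigenSafe algorithm itself:

```python
import numpy as np

def dominant_eigenpair(apply_operator, dim, iters=1000, tol=1e-10, seed=0):
    """Power iteration for the dominant eigenpair of a linear operator.

    apply_operator: callable v -> A v, e.g. a discretized dynamic-programming
    operator on a gridded state space (the actual EigenSafe operator is not
    reproduced here). Returns (eigenvalue estimate, eigenvector)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(iters):
        w = apply_operator(v)
        lam_new = v @ w                      # Rayleigh-quotient eigenvalue estimate
        v = w / np.linalg.norm(w)
        if abs(lam_new - lam) < tol:
            break
        lam = lam_new
    return lam, v
```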
An end-to-end learned robotic system can now clean your kitchen in a completely new house, thanks to a novel co-training approach on diverse data.