Robot learning, embodied agents, manipulation, locomotion, and sim-to-real transfer with foundation models.
Train drone operators in realistic battlefield environments without ever leaving the simulator, thanks to Unreal Engine's built-in AI.
Forget hand-crafted rewards: MotionVL uses VLMs and LLMs to automatically generate task-aligned reward functions for humanoid robot RL, leading to more human-like and robust motion.
Robots get a 33% speed boost and become significantly more adaptable when you let LLMs handle the reasoning and RL handle the movements.
Pythonistas rejoice: aggregate programming, a powerful paradigm for distributed systems, finally gets a first-class, easy-to-use library in your favorite language.
Autonomous vehicles can drive more safely and reliably by grounding LLM reasoning in a "Commonsense World" that quantifies and leverages the trustworthiness of LLM outputs.
Achieve superhuman robot dexterity with 10x fewer demonstrations by decoupling intent and action through latent world modeling.
Automating scientific discovery is now more accessible: Owl-AuraID navigates proprietary GUIs to control diverse precision instruments, freeing researchers from tedious manual operation.
Achieve real-time, privacy-aware action detection on edge devices by intelligently fusing fast skeleton tracking with vision-language models, outperforming either approach alone.
Robots can now generalize to unseen objects and categories for manipulation tasks with only a few training examples, thanks to a novel retrieval-augmented affordance prediction framework.
Emulating human movement with 700 muscles reveals that many different control strategies can produce the same observed motion, challenging the assumption that kinematics uniquely define muscle activation.
Smart industrial systems, while promising increased efficiency, introduce unforeseen interoperability side-effects and heightened vulnerability to cyber threats across heterogeneous IIoT systems.
Robots can now learn to reproduce oil paintings with impressive accuracy through self-play and learned dynamics, even without human demonstrations or high-fidelity simulators.
Physical AI systems struggle not with visual recognition, but with understanding space, physics, and action – and PRISM, a new retail video dataset, dramatically closes this gap.
Assistive robots aren't just vulnerable to data breaches; they can be hacked to physically harm the very people they're supposed to protect.
Open-source SurgNavAR slashes the barrier to entry for AR surgical navigation research, offering a ready-to-use framework adaptable to diverse surgical applications.
Synthetic data, when carefully aligned with real-world characteristics, can boost hand-object interaction detection by over 11% even when real labeled data is scarce.
Vision-language models falter at the fine-grained temporal recognition crucial for surgical video understanding, while SurgRec excels.
Even state-of-the-art VLMs exhibit systematic failures in reasoning about the physical feasibility of actions in 3D environments, despite high semantic confidence.
Forget expensive labels: CoRe-DA leverages contrastive learning and self-training to achieve state-of-the-art surgical skill assessment across diverse surgical environments without requiring target domain annotations.
Surgeons can now pinpoint tumor margins with millimeter precision using augmented reality, potentially reducing positive margins in head and neck cancer resections.
Ditching depth map projections for camera-LiDAR calibration unlocks significant gains in accuracy and robustness, especially when starting from poor initial extrinsic estimates.
Quantifying and integrating map uncertainty—both positional and semantic—into trajectory prediction pipelines significantly boosts forecast accuracy, even when using existing baseline models.
LLMs can generate more accurate motion trajectories by clustering them into geometrically consistent families, even without retraining.
Achieve a 60% reduction in trajectory error for monocular SLAM by tightly integrating multi-task dense prediction with a compact perception-to-mapping interface.
Reconstructing dynamic 3D scenes from video just got a whole lot better: MotionScale achieves state-of-the-art fidelity and temporal stability by scaling Gaussian splatting to long, complex sequences.
Forget tedious optimization – LightHarmony3D generates realistic lighting and shadows for inserted 3D objects in a single pass, making scene augmentation feel truly real.
Turn 2D orthographic views into 3D models automatically using corner detection and geometric reconstruction.
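The classic idea behind orthographic-view reconstruction can be shown with a toy sketch (my own minimal illustration under simplifying assumptions, not the paper's pipeline): corners detected in a front view (x, z) and a top view (x, y) are matched on their shared x-coordinate to propose 3D vertices.

```python
def reconstruct_vertices(front_view, top_view, tol=1e-6):
    """front_view: list of (x, z) corners; top_view: list of (x, y) corners.

    Returns candidate 3D vertices (x, y, z) consistent with both
    orthographic projections, matched on the shared x-coordinate.
    """
    vertices = []
    for fx, fz in front_view:
        for tx, ty in top_view:
            if abs(fx - tx) < tol:  # same x in both projections
                vertices.append((fx, ty, fz))
    return vertices

# A unit cube projects to 4 corners in each view.
front = [(0, 0), (1, 0), (0, 1), (1, 1)]  # (x, z)
top = [(0, 0), (1, 0), (0, 1), (1, 1)]    # (x, y)
verts = reconstruct_vertices(front, top)
```

Real systems also need a side view and edge-consistency checks to prune spurious candidate vertices; this cube happens to reconstruct exactly.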
Unlock adaptable human augmentation in everyday environments with reconfigurable robotic limbs, guided by quantitative analysis of workspace extension and human-robot collaboration.
A rotating haptic compass on your wrist dramatically improves robotic teleoperation by providing intuitive directional cues, outperforming traditional vibration-based feedback and even improving imitation learning.
You can halve the polygon count of dynamic 3D meshes in VR without users noticing, but existing quality metrics won't tell you that.
Passive iFIR filters learned from just three minutes of robot data can dramatically outperform optimized PID controllers in velocity tracking tasks, offering a fast and stable alternative for robot control.
Get provably safe and dynamically robust robot motions in human environments without the computational bottleneck of online optimization.
Unlock rapid UAV design iteration with MetaMorpher's modular, nonlinear flight dynamics model that accurately simulates diverse wing configurations and flight modes.
Running LLM-powered semantic scene understanding directly on edge devices can keep your robot from crashing.
A long-reach robot arm can gently clean lunar solar panels, even with limited force feedback, opening the door to autonomous maintenance on the moon.
Guaranteeing safety in multi-agent systems with dynamic networks doesn't have to sacrifice performance: this plug-and-play protocol ensures recoverable safety even when agents join/leave or network topologies shift.
Offline RL can now tackle complex, unseen temporal logic tasks without retraining, by stitching together learned short-horizon behaviors into long-horizon plans.
UUVs can navigate communication blackouts with 91% more accuracy by distilling patterns from their past trajectories.
By optimizing PID gains with MPPI, this method achieves comparable performance to conventional MPPI with significantly fewer samples, offering a more sample-efficient approach to learning-based control.
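The general recipe of tuning PID gains with MPPI-style sampling can be sketched on a toy system (a hedged illustration of the idea, not the paper's method — the plant, costs, and hyperparameters here are my own assumptions): sample gain perturbations, roll out each candidate controller, then take an importance-weighted average favoring low-cost gains.

```python
import numpy as np

rng = np.random.default_rng(0)

def rollout_cost(gains, dt=0.02, steps=200, target=1.0):
    """Simulate PID tracking on a unit-mass double integrator; return cost."""
    kp, ki, kd = gains
    pos, vel, integ, prev_err, cost = 0.0, 0.0, 0.0, target, 0.0
    for _ in range(steps):
        err = target - pos
        integ += err * dt
        deriv = (err - prev_err) / dt
        u = kp * err + ki * integ + kd * deriv
        prev_err = err
        vel += u * dt  # force equals acceleration for unit mass
        pos += vel * dt
        cost += err**2 + 1e-4 * u**2  # tracking error plus control effort
    return cost

gains = np.array([1.0, 0.0, 0.0])  # initial (kp, ki, kd): undamped
sigma, lam, n_samples = 0.5, 1.0, 64

for _ in range(30):  # MPPI-style iterations over gain space
    noise = sigma * rng.standard_normal((n_samples, 3))
    candidates = np.clip(gains + noise, 0.0, None)  # keep gains nonnegative
    costs = np.array([rollout_cost(c) for c in candidates])
    weights = np.exp(-(costs - costs.min()) / lam)
    weights /= weights.sum()
    gains = weights @ candidates  # importance-weighted gain update
```

The softmax weighting is the MPPI update applied to controller parameters instead of a control sequence; the tuned gains damp the initially oscillatory response.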
Humanoids can now nimbly navigate real-world clutter and complex terrain using only raw depth data, ditching hand-engineered geometric representations.
Achieve state-of-the-art robotic manipulation with a model orders of magnitude smaller than VLAs by explicitly aligning kinematic and semantic transitions.
Forget brute-force coverage – this method learns from simulated expert guidance to prioritize semantically relevant areas, dramatically speeding up target search in unseen environments.
Legged robots can now navigate more accurately using only internal sensors, even with imperfect foot contact, thanks to a new probabilistic method that dynamically adapts to different contact scenarios.
Automating disassembly of complex, degraded appliances in recycling plants is now feasible, achieving high accuracy without pre-programmed coordinates.
SuperGrasp achieves robust single-view grasping by cleverly combining superquadric-based similarity matching with an end-to-end refinement network, outperforming existing methods in stability and generalization.
Real-time, uncertainty-aware signed distance functions are now possible without sacrificing accuracy, thanks to a novel kernel regression and GP regression hybrid.
Get kilohertz-level dexterous hand teleoperation *with* formal safety guarantees, thanks to a new convex optimization approach.
Policies trained with GenSplat maintain robust performance under severe spatial perturbations where baseline methods completely fail, thanks to its novel 3D Gaussian Splatting-based augmentation.
VLN agents can now "dream ahead" by learning action-conditioned visual dynamics in a latent space, leading to SOTA results and improved real-world navigation.
Ignoring control packet loss in drone communication can lead to trajectory divergence, but this integrated sensing-communication-control scheme achieves decimeter-level accuracy.