Robot learning, embodied agents, manipulation, locomotion, and sim-to-real transfer with foundation models.
Interactive 3D asset generation can now be driven by functional logic and hierarchical physics, thanks to a new framework that synthesizes simulation-ready assets.
Stop committing to a single policy in offline-to-online RL: adaptively select and fine-tune policies based on predicted performance to maximize returns under interaction budgets.
Stabilizing nuclear fusion plasma with imitation learning is possible even with limited macroscopic observations, offering a path to practical control strategies.
Drivers dynamically switch their perceptual priorities from gap-closing rate to visual looming as braking intensity decreases, overturning long-held assumptions about car-following behavior.
Task-aware 3D reconstruction slashes the number of views needed by focusing on the data that actually matters for downstream applications.
You don't need a full causal graph to avoid undesired outcomes; learning a simple order structure can be enough, and even outperform methods that try to learn the whole graph.
Reservoir computing offers a surprisingly effective way to build Koopman dictionaries for nonlinear system identification, sidestepping the usual dictionary selection and ill-conditioning problems.
Average reward RL can finally handle the messy reality of non-stationary rewards and durations in SMDPs, thanks to a clever harmonic mean trick.
Control-dependent latent dynamics, achieved with a surprisingly small parameter increase, unlock robust MPC performance in time-varying environments where standard Koopman methods falter.
Turns out, all gaze estimation models stumble when robots look down, and complex architectures aren't the answer – data diversity is the real secret to robust human-robot interaction.
MoEs, despite their scaling advantages, suffer from a surprising "spectral plasticity loss" in continual RL, but a simple Parseval penalty can recover performance.
Achieve robust long-horizon visual control by adaptively balancing model-based lookahead with bootstrapping, enabling zero-shot transfer to real-world tasks with severe occlusions.
Decomposing robot swarm state representations unlocks effective cooperation even with computationally limited agents.
Forget brittle imitation learning: Q2RL unlocks robust on-robot reinforcement learning by distilling a Q-function from behavior cloning and intelligently gating between imitation and RL based on Q-value estimates.
Forget hand-crafted reward functions: this RL framework lets a bicycle robot learn complex stunts from just a spatial guideline and a few key poses.
Predicting driver behavior in response to traffic conditions is now possible with a new world model that causally links external context to internal driver states.
End-to-end ML models get smoked in real-world mmWave vehicular connectivity: a hybrid vision-primed approach slashes outage rates by leveraging model-based reasoning and RF feedback.
Fragmented privacy patches are insufficient for Embodied AI: a unified, lifecycle-level approach is needed to prevent systemic privacy leakage in real-world deployments.
LLM-powered multi-agent collaboration can boost zero-shot IMU activity recognition accuracy by 29% compared to existing agent models, even surpassing deep learning baselines.
Gradient-based MPC can finally beat gradient-free methods in continuous control, thanks to Dream-MPC's clever combination of learned policies, world models, uncertainty regularization, and optimization amortization.
DAOs could unlock a new era of human-machine collaboration by democratizing the operation and governance of physical-digital systems.
Unlock zero-shot 3D scene understanding: Ilov3Splat lets you identify and segment arbitrary objects in 3D scenes using only natural language, no category supervision needed.
Mixing tasks with different safety levels in automotive ECUs can compromise critical functions, highlighting the need for careful task allocation strategies.
Guaranteeing safety in autonomous systems gets a boost: this work enables formal verification of hybrid system code that directly controls physical processes.
Get high-fidelity tactile simulations with 65% speedup and 40% less memory by combining coarse physics with neural implicit reconstruction.
By intelligently incorporating LiDAR-derived height information, HiPR overcomes limitations of fixed projection spaces, achieving state-of-the-art camera-LiDAR occupancy prediction with real-time performance.
Finally, a driving dataset that doesn't just assume perfectly paved roads, offering 6.5x more depth data than KITTI for realistic autonomous driving scenarios.
Adult-trained human mesh recovery models can now handle kids, too, thanks to a new framework that enforces spatial consistency and leverages VLM-derived age and gender cues.
Synthesizing realistic duet dance motions gets a boost from explicitly modeling inter-dancer contact, leading to significantly improved interaction fidelity and rhythmic synchronization.
Bridging the gap between aerial and ground-level tracking, VL-UniTrack uses visual-language prompts to achieve robust object tracking even with significant viewpoint differences.
Radar SLAM can now achieve state-of-the-art performance via direct scan registration, eliminating the need for hand-engineered feature extraction and enabling robust localization in adverse weather.
Autonomous driving gets a boost: CRAFT cleverly combines the best of both worlds – dense counterfactual supervision and grounded closed-loop feedback – to significantly improve driving policies.
Robots can reliably hand over objects to humans by actively probing grasps, achieving a 30% improvement over passive methods.
RL fine-tuning unlocks a 6x performance gain for in-place trajectory editing in autonomous driving, demonstrating the power of aligning diffusion planners with reinforcement learning.
Stop relying on significance tests that only find differences: this Bayesian framework tells you if your synthetic data is *practically equivalent* to real-world data for your specific safety assessment task.
Tactile feedback, when strategically sampled and evaluated, unlocks robust and safe robotic insertion policies even under sub-millimeter tolerances.
Achieve real-time bipedal walking control by cleverly swapping high-fidelity for low-fidelity models in MPC, slashing computation without sacrificing stability.
Diffusion models can now plan effectively for long-horizon tasks by strategically generating subgoals that are then efficiently realized by rectified flow models.
Generate more realistic and diverse safety-critical autonomous vehicle scenarios by using conditional latent flow matching to bridge the gap between real-world and simulated data.
Dynamically adjusting trajectory optimization based on real-time navigation confidence enables robust low-thrust rendezvous, slashing miss distances by two orders of magnitude when faced with degraded sensor data.
Achieve autonomous laparoscope control by translating multimodal surgical data into a single "wrench" that guides the robot's movements.
Forget digital watermarks – now you can physically fingerprint solutions with electrochemically generated polymer patterns, opening doors to low-cost, physically encrypted personal information.
Hand-eye calibration gets a 67% accuracy boost in high-uncertainty scenarios thanks to a new optimization framework that cleverly avoids explicit uncertainty modeling.
AI is enabling a new generation of AUV navigation systems that overcome the limitations of traditional model-based approaches in complex underwater environments.
Forget complex assembly: this 3D printing technique lets you pop out functional, self-folding robots with integrated sensors and actuators directly from a flat sheet.
Robotic manipulation gets a serious upgrade: ConsisVLA-4D boosts performance by up to 41.5% and speeds up inference by 2.4x, all while ensuring your robot understands the scene in 3D *and* how it changes over time.
Guaranteeing robot safety in unknown environments doesn't require complex planning – this closed-form CBF filter does it with minimal computation.
Standard camera auto-exposure is blind to the needs of remote heart-rate monitoring, but a new method closes the gap to enable robust in-vehicle driver monitoring.
By grounding temporal Gaussian aggregation in spatial voxels, Ground4D achieves state-of-the-art 4D reconstruction in challenging off-road environments where existing methods falter.
Stop feeding LLMs redundant and conflicting sensor data in autonomous driving: a new architecture slashes hallucinated entities by coordinating multi-sensor inputs *before* reasoning.