Search papers, labs, and topics across Lattice.
100 papers published across 9 labs.
Far from just a childhood game, Pokemon emerges as a surprisingly effective AI benchmark, revealing critical gaps in LLMs and RL agents that existing benchmarks miss.
Forget painstakingly crafted rewards and curricula: this new RL framework learns surprisingly dexterous manipulation skills just by resetting the simulator in diverse ways.
Achieve real-time online learning for model predictive control with a novel spatio-temporal Gaussian Process approximation that maintains constant computational complexity.
By iteratively reasoning over video snippets with a Chain-of-Thought, $\text{R}^2$VLM achieves state-of-the-art long-horizon task progress estimation without needing to process entire videos at once.
Ditching rigid digital twins for adaptable world models could unlock truly intelligent edge computing in 6G networks.
By treating 3D scene editing as goal-regressive planning rather than pure generation, Edit-As-Act achieves instruction fidelity, semantic consistency, and physical plausibility that existing methods miss.
Legged robots can navigate more reliably with noisy sensors thanks to a new state estimator that avoids Gaussian noise assumptions.
Achieve stable, real-time kilometer-scale autonomous driving simulations by generating vector-graph tiles incrementally using a novel diffusion flow approach.
Seemingly accurate physics-informed surrogates can fail spectacularly when integrated into power system simulations, especially under stress, highlighting the need for rigorous in-simulator validation.
Generate consistent stereo videos directly from RGB data, bypassing depth estimation and monocular-to-stereo conversion, with StereoWorld's novel camera-aware attention mechanisms.
Representing highly nonlinear vehicle dynamics in a lifted linear space via Koopman operator theory enables state-of-the-art long-term state estimation for complex electric trucks.
Simulate earthquake ground motion 10,000x faster with a new latent operator flow matching method, opening the door to real-time risk assessment for critical infrastructure.
Forget rigid physics engines, this badminton RL environment uses real player data to simulate realistic rallies and strategic gameplay.
Heuristic maritime routes lead to extreme fuel waste in nearly 5% of voyages, but this RL approach cuts that risk by almost 10x.
LLMs in embodied environments get a massive boost from structured rules, with rule retrieval alone contributing +14.9 pp to single-trial success.
VLN agents can navigate more effectively by predicting their future states and proactively planning based on forecasted semantic map cues, rather than relying solely on historical context.
Encoding deformable object dynamics with particle positions unlocks sim-to-real transfer for manipulation tasks, achieving impressive real-world success rates.
Drones can now land safely in complex, unknown environments using only a camera, thanks to a new system that dynamically maps and reacts to surroundings in real-time.
Ditch fixed compute budgets: this new flow-matching method for robotic control adaptively allocates computation, speeding up simple tasks and focusing on complex ones.
ManiDreams lets robots handle real-world uncertainty in manipulation tasks without retraining, outperforming standard RL baselines under various perturbations.
Robot world models can be significantly improved by directly rewarding them for generating videos that lead to physically plausible robot actions, even if the videos themselves contain visual artifacts.
A complete autonomy stack enables centimeter-level localization and mapping on the moon, even without GPS.
Finally, a rigorous RL benchmark: generate environments with *provably* optimal policies, enabling controlled algorithm evaluation against ground truth.
Accurately predict urban pollutant dispersion in real-time with a novel data-driven model that's orders of magnitude faster than traditional CFD.
Demonstrator diversity unlocks the ability to learn latent actions and dynamics from offline RL data, even without explicit action labels.
LLMs can be economically aligned to real-world consumer preferences via post-training on transaction data, enabling more accurate and stable economic simulations.
By cleverly turning novel view synthesis into a self-supervised inpainting problem, VisionNVS eliminates the need for ground truth images of novel views, outperforming LiDAR-dependent baselines.
Forget finetuning: DynaEdit unlocks complex video edits like action modification and object insertion, all without training, using clever manipulation of pretrained text-to-video models.
Achieve zero-shot adaptation to new tasks in complex control environments by learning a shared low-dimensional goal embedding that unifies policy and value function representations.
NeRFs can now guide extraterrestrial rovers around unexpected obstacles, thanks to a novel planning framework that blends local observations with global terrain understanding.
Q-value policies, traditionally outperformed by state-value policies in planning, can surpass them with the right regularization, offering a faster alternative for policy evaluation.
Robots can now plan 9x faster and achieve significantly higher success rates by decoupling action prediction from video generation in World-Action Models.
A new mixed reality testbed lets you plug real human drivers into a CAV simulation, offering unprecedented realism for testing autonomous vehicle interactions.
Guaranteeing robot safety and task completion just got easier: this method enforces complex temporal logic constraints on pre-trained robotics models without any fine-tuning.
Human unpredictability is now a feature, not a bug: a mixed-reality testing framework leverages human interaction to generate high-quality corner cases for vehicle-infrastructure cooperation systems.
Autoregressive neural surrogates can now simulate dynamical systems for infinitely long horizons, thanks to a novel self-refining diffusion model that avoids error compounding.
Ditch the data augmentation and decoders: R2-Dreamer's Barlow Twins-inspired objective delivers faster, more versatile MBRL, especially when spotting the small stuff matters.
Kinema4D unlocks zero-shot transfer in embodied AI by simulating physically plausible 4D robot-world interactions, moving beyond rigid 2D constraints.
Fine-tuning Vision-Language Model planners for robotic manipulation is now significantly more efficient and safer thanks to a novel framework that leverages video world models to simulate real-world physics.
Autonomous robots can now more safely and effectively inspect cluttered, radioactive environments by combining information gain-based planning with stochastic obstacle avoidance.
Neural approximations of Hamilton-Jacobi reachability can now be formally certified for safety, enabling provably safe robot navigation in unknown environments.
PyPhonPlan offers a new open-source toolkit to simulate speech dynamics with neurally-grounded representations, enabling researchers to model interactive speech production and perception loops.
LLM-based simulations of public opinion suffer from "Diversity Collapse," but injecting explicit social identity representations into hidden states can fix it.
Rank-1 LoRA fine-tuning can safely and efficiently adapt simulated locomotion policies to real-world robots, slashing fine-tuning time by nearly half while maintaining safety.
By fusing IMU and insole pressure data within a physics simulation, GRIP achieves more physically plausible human motion capture than IMU-only methods.
Accurately simulating the snap-fit mechanics of interlocking bricks, BrickSim unlocks a new level of realism for robotic manipulation research involving complex assemblies.
By reframing robot inspection planning as a network flow problem, this work achieves a 30-50% reduction in optimality gaps and scales to instances previously intractable for state-of-the-art methods.
Humanoid robots can now nimbly navigate complex terrain with drastically reduced computational cost thanks to a novel adaptive sensing architecture.
A lightweight transformer can accurately forecast diverse aircraft trajectories in complex airspace, outperforming prior methods and enabling real-time safety applications.
Reinforcement learning can now orchestrate the complex, whole-body movements of salamander robots, enabling seamless transitions between walking and swimming.
A MuJoCo-based MPC can effectively control shipboard cranes in real-time, even with double-pendulum sway and external perturbations, outperforming traditional PID and RL methods on embedded hardware.
Smarter placement of slow chargers can significantly reduce the need for expensive en-route EV charging, leading to lower overall system costs.
Multimodal agents can now plan more coherently and solve complex tasks thanks to a new anticipatory reasoning framework that forecasts short-horizon trajectories before acting.
Generating 3D scenes with diffusion models just got a whole lot more consistent across views, thanks to a new 3D-native approach that skips the 2D latent space bottleneck.
Stochastic resetting (randomly teleporting RL agents back to the start) surprisingly speeds up learning, even when it wouldn't help a non-learning agent.
By penalizing treatment plans that lead to trajectory distributions far from observed patient data, this method provides a more robust approach to treatment optimization than standard model-based methods.
Transformers trained on a simple grid-world learn hidden representations that directly reflect the underlying predictive geometry, offering a glimpse into how neural networks internalize structural constraints.
LLMs can learn to recover from mistakes more effectively by reflecting on past failures and internalizing actionable feedback, leading to significant gains in long-horizon problem-solving.
Skip the expensive modeling step: this data-driven approach to traffic light control directly optimizes traffic flow using real-world data, slashing travel times and emissions in a massive Zürich simulation.
Kinodynamic motion planning just got a whole lot faster: AkinoPDF achieves microsecond-level planning times by exploiting differential flatness for analytical solutions.
Ditch slow, multi-step video generation: S-VAM distills the structured generative priors of multi-step denoising into a single forward pass for real-time robot action prediction.
By treating camera pose as a unifying geometric representation, WorldCam achieves significantly improved action controllability and long-horizon 3D consistency in interactive gaming world models compared to prior video diffusion transformer approaches.
By amortizing sequential design into a neural network, this method achieves real-time model-based design of experiments, unlocking new possibilities for efficient parameter estimation in complex dynamical systems.
Forget expensive real-world data collection: a massive, diverse synthetic dataset enables surprisingly effective zero-shot transfer for robotic manipulation.
Turn inconsistent video diffusion models into surprisingly coherent 3D world generators with a novel alignment and rendering approach.
Coordinating fleets of autonomous vessels to clean up multiple oil spills can be near-optimized in minutes using a hybrid optimization approach, enabling rapid, risk-aware responses to large-scale disasters.
Reinforcement learning can effectively control collective animal behavior in the real world, even when individuals frequently ignore the artificial stimulus.
Constraint propagation can significantly boost dynamic programming by pruning states and transitions, but the overhead needs further optimization.
World Action Models can ditch the slow, iterative "imagine-then-execute" loop at test time without sacrificing performance, achieving a 4x speedup.
Forget kinematic tree approximations: Kamino unlocks high-fidelity, massively parallel robot simulations with closed kinematic chains directly on GPUs.
Competitive reinforcement learning enables agile drone interception with higher catch rates and lower crash rates compared to heuristic baselines, even in real-world scenarios.
Skip the manual effort: CABTO uses large models to automatically generate complete and consistent behavior tree systems for robot manipulation.
LLM Transformers can be effectively repurposed to enhance motion forecasting in autonomous driving by capturing temporal context in continuous driving scenarios.
Automating vehicle fault diagnostics by treating error codes as a language unlocks scalable predictive maintenance and causal understanding in complex automotive systems.
Current MLLMs struggle with even basic route planning in remote sensing, highlighting a critical gap in their ability to translate perception into action in complex, real-world scenarios.
Robots can now dynamically adjust their movements for legibility versus efficiency on the fly, without retraining, by using a lightweight module that detects environmental ambiguity and modulates a diffusion policy.
End-to-end autonomous driving systems, like Tesla's FSD, are proving commercially viable by effectively handling the long tail of real-world driving scenarios, signaling a major shift from rule-based approaches.
LLMs can now plan complex, sequential robotic maneuvers through narrow spaces by learning from human demos and refining with geometric rewards, outperforming traditional methods.
Forget finetuning a new LoRA for every character: EverTale introduces a single LoRA that adapts to *all* characters in a story, enabling continuous character customization with improved fidelity and efficiency.
Achieve minute-level navigable video world models by combining the strengths of explicit 3D patch memory with implicit generative modeling.
Just five minutes of real-world teleoperation data is enough to train a copilot that significantly boosts both novice and expert performance on complex manipulation tasks.
Unlock globally optimal control policies in high-dimensional systems by unifying trajectory optimization with Hamilton-Jacobi-Bellman methods via a novel "Featurized Occupation Measure" framework.
LLMs can exhibit surprising "strategic realism" when analyzing an ongoing geopolitical conflict, but their reasoning falters in politically ambiguous situations, revealing critical domain-specific limitations.
Humanoid robots can now learn to walk with provably direction-dependent compliance, thanks to a new anisotropic Lipschitz constraint on RL policies.
LLM agents struggle to maintain coherent decision-making in realistic retail environments over long horizons, even with a novel framework for adaptive strategy evolution.
Finally, a method exists to create 3D human-scene interaction models from casual captures that are stable enough for use in physics simulations and deployment on real-world robots.
Humanoid robots can now handle heavy, unknown payloads in the real world thanks to a system that identifies mass distribution via differentiable simulation.
Overcome communication bottlenecks in multi-agent RL by selectively communicating with reachable agents and predicting interference to optimize partner choice.
UAVs can explore longer and more efficiently by explicitly optimizing for energy consumption, as demonstrated by a new frontier exploration framework that reduces energy use without sacrificing speed or map quality.
Autonomous driving planners can now explicitly self-correct unsafe actions by generating motion-token traces conditioned on a learned collision critic, leading to significant safety improvements.
Imagine a world model that doesn't just dream up environments, but flawlessly renders a real city like Seoul, complete with text-prompted scenarios and diverse camera movements.
Finally, a scalable method lets you explore billions of scientific models and their parameters, all while interactively tuning model complexity *after* seeing the data.
Choosing the right formalism for robot mission specification—Behavior Trees, State Machines, HTNs, or BPMN—can make or break your robot's ability to handle real-world complexity.
CRL struggles with hard-to-reach goals, but ViSA, a new data augmentation technique, solves this by generating synthetic states and regularizing the embedding space, leading to better value estimation.
WorldDrive achieves leading autonomous driving performance by unifying visual scene generation and motion planning, demonstrating that a shared representation space significantly improves both prediction accuracy and planning robustness.
Achieve SOTA extrapolated-view LiDAR synthesis by fusing multi-frame LiDAR data and spatially-constrained dropout regularization, enabling robust autonomous driving simulation without multi-pass data.
Achieve state-of-the-art closed-loop autonomous driving simulation with sub-second latency using a novel frame-autoregressive video generation framework.
A new policy iteration algorithm, iPI, closes the gap between existing safety verification methods by matching the best-case runtime of TarjanSafe while guaranteeing polynomial worst-case scaling.
Infant motor learning reveals a sharp phase transition in control strategy arbitration, governed by context window size and predictable via a closed-form exponential moving average.
Quadruped robots can now learn diverse skills and adapt to complex terrains without expert datasets, thanks to a novel keyframe-guided self-imitation learning framework.