100 papers published across 7 labs.
A 10kg quadrupedal robot, LIMBERO, can now climb steep, rocky surfaces thanks to a novel gripper design that achieves exceptional grasping performance with minimal weight.
Achieve real-time online learning for model predictive control with a novel spatio-temporal Gaussian Process approximation that maintains constant computational complexity.
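For a feel of what constant-cost online GP learning looks like, here is a minimal sketch: a sliding-window GP that caps its training set at W points, so every update and prediction costs the same no matter how long the data stream runs. The class name, RBF kernel, and eviction policy are illustrative stand-ins, not the paper's spatio-temporal approximation.

```python
import numpy as np

class SlidingWindowGP:
    """Constant-cost online GP regression: keep only the W most recent
    samples, so each update and prediction is O(W^3) regardless of how
    long the stream runs. Illustrative stand-in, not the paper's method."""

    def __init__(self, window=50, lengthscale=1.0, noise=1e-2):
        self.W, self.ls, self.noise = window, lengthscale, noise
        self.X, self.y = [], []

    def _k(self, A, B):
        # RBF kernel: k(a, b) = exp(-||a - b||^2 / (2 ls^2))
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / self.ls ** 2)

    def update(self, x, y):
        self.X.append(np.asarray(x, float)); self.y.append(float(y))
        if len(self.X) > self.W:                 # evict the oldest sample
            self.X.pop(0); self.y.pop(0)

    def predict(self, x_star):
        X, y = np.stack(self.X), np.array(self.y)
        K = self._k(X, X) + self.noise * np.eye(len(X))
        k_s = self._k(np.atleast_2d(np.asarray(x_star, float)), X)[0]
        mean = k_s @ np.linalg.solve(K, y)
        var = 1.0 - k_s @ np.linalg.solve(K, k_s) + self.noise
        return mean, max(var, 0.0)

gp = SlidingWindowGP(window=50)
for t in range(200):                             # streaming model-error samples
    gp.update([np.sin(0.1 * t), np.cos(0.1 * t)], np.sin(0.1 * t + 0.3))
mu, var = gp.predict([0.0, 1.0])
```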
By explicitly reasoning in 3D, VolumeDP leaps ahead of 2D-based imitation learning methods, achieving a remarkable 14.8% improvement on the LIBERO benchmark and robust real-world generalization.
By iteratively reasoning over video snippets with a Chain-of-Thought, $\text{R}^2$VLM achieves state-of-the-art long-horizon task progress estimation without needing to process entire videos at once.
Ditching rigid digital twins for adaptable world models could unlock truly intelligent edge computing in 6G networks.
LLMs can be prompted to generate part-aware instructions that substantially improve open-vocabulary 3D affordance grounding by linking semantically similar affordances and refining geometric differentiation.
Forget complex communication protocols – this trust-based algorithm lets agents learn to cooperate in competitive environments with minimal overhead.
By treating 3D scene editing as goal-regressive planning rather than pure generation, Edit-As-Act achieves instruction fidelity, semantic consistency, and physical plausibility that existing methods miss.
Legged robots can navigate more reliably with noisy sensors thanks to a new state estimator that avoids Gaussian noise assumptions.
Achieve stable, real-time kilometer-scale autonomous driving simulations by generating vector-graph tiles incrementally using a novel diffusion flow approach.
Forget verbose instructions: this new VLN paradigm uses floor plans to guide navigation with concise commands, boosting success rates by 60%.
Robots can now navigate based on your spoken preferences and visual context, thanks to a clever fusion of VLMs, LLMs, and multi-objective RL.
Locomotion policies, often considered black boxes, can autonomously learn interpretable phase structures and branching logic, revealing a hidden order in their decision-making.
Network coding, often overlooked in robotics, can drastically improve the reliability and timeliness of multi-robot communication, outperforming traditional retransmission methods in safety-critical scenarios.
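A flavor of why coding beats retransmission comes from the classic two-flow relay example, sketched below (packet contents are made up): one coded broadcast lets both robots recover their missing packet, replacing two separate unicasts.

```python
# Robots A and B each send the relay a packet meant for the other; instead
# of forwarding both, the relay broadcasts a single XOR-coded packet.
pkt_a = bytes([0x13, 0x37])                          # A's packet, destined for B
pkt_b = bytes([0xBE, 0xEF])                          # B's packet, destined for A

coded = bytes(x ^ y for x, y in zip(pkt_a, pkt_b))   # one broadcast from the relay

# each robot XORs out the packet it already knows to recover the other's
recovered_b = bytes(x ^ y for x, y in zip(coded, pkt_a))  # at robot A
recovered_a = bytes(x ^ y for x, y in zip(coded, pkt_b))  # at robot B
assert recovered_a == pkt_a and recovered_b == pkt_b
```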
Ergodic control lets swarms of robots cooperatively manufacture micro-patterned surfaces, unlocking scalable production of materials with enhanced physical properties.
A wearable hand exoskeleton that prioritizes comfort and adaptability unlocks scalable robot learning by enabling direct policy training from raw visual data, bypassing complex post-processing.
Robots often ignore your commands mid-task, but ReSteer offers a way to fix this by pinpointing and patching the "blind spots" in their training data.
Ditch costly PIDE integration: RHYME-XT learns the flow map directly, offering a continuous-time, discretization-invariant representation that beats state-of-the-art neural operators.
Robots can now nimbly navigate complex, multi-floor environments without prior training, thanks to a new strategy that dynamically switches between exploration, recovery, and memory recall.
Legged robots can now perform robust parkour with a 1-meter visual blind zone, thanks to a novel architecture that tightly couples vision, proprioception, and physics-based state estimation.
Synthetic data and virtual environments are rapidly becoming indispensable for autonomous driving, but realizing their full potential requires tackling challenges like Sim2Real transfer and scalable safety validation.
Achieve state-of-the-art semantic 3D reconstruction from sparse views by intelligently pruning redundant Gaussians and blending 2D and 3D semantic cues.
Synthesizing realistic 6-DOF object manipulation trajectories in complex 3D environments just got a whole lot better with GMT, a multimodal transformer that substantially outperforms existing methods.
Cycle consistency training unlocks stable and accurate inverse kinematics for wearable soft robots, even with their inherent nonlinearities and hysteresis.
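The core trick is easy to sketch. Below is a minimal, hypothetical PyTorch version (network sizes, dimensions, and the mlp helper are illustrative, not the paper's architecture): a frozen forward model fk maps actuation to pose, and the inverse model ik is trained so the cycle pose → actuation → pose closes.

```python
import torch
import torch.nn as nn

def mlp(d_in, d_out, hidden=64):
    return nn.Sequential(nn.Linear(d_in, hidden), nn.Tanh(),
                         nn.Linear(hidden, hidden), nn.Tanh(),
                         nn.Linear(hidden, d_out))

d_act, d_pose = 4, 3
fk = mlp(d_act, d_pose)              # forward model, assumed already fit to robot data
ik = mlp(d_pose, d_act)              # inverse model to be trained
for p in fk.parameters():
    p.requires_grad_(False)          # the forward model stays fixed

opt = torch.optim.Adam(ik.parameters(), lr=1e-3)
for step in range(2000):
    pose = torch.rand(128, d_pose)                    # sample target poses
    cycle_loss = (fk(ik(pose)) - pose).pow(2).mean()  # pose -> actuation -> pose
    opt.zero_grad(); cycle_loss.backward(); opt.step()
```

Because the loss is measured after the round trip, the inverse model absorbs whatever nonlinearity and hysteresis the forward model encodes.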
Representing highly nonlinear vehicle dynamics in a lifted linear space via Koopman operator theory enables state-of-the-art long-term state estimation for complex electric trucks.
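A minimal sketch of the underlying Koopman recipe, via extended dynamic mode decomposition with a toy polynomial dictionary (the paper's lifting and truck dynamics are far richer; all names and data here are illustrative):

```python
import numpy as np

def lift(x):
    # toy dictionary: the state itself plus simple polynomial observables
    return np.concatenate([x, x ** 2, [x[0] * x[1], 1.0]])

def fit_koopman(X, X_next):
    """X, X_next: (N, d) arrays of consecutive states. Least-squares fit
    of K such that lift(x_{t+1}) ~= K @ lift(x_t)."""
    Phi  = np.stack([lift(x) for x in X])
    Phi2 = np.stack([lift(x) for x in X_next])
    W, *_ = np.linalg.lstsq(Phi, Phi2, rcond=None)
    return W.T

def rollout(K, x0, steps, d):
    # long-horizon prediction: iterate the *linear* lifted dynamics
    phi, traj = lift(x0), [np.asarray(x0, float)]
    for _ in range(steps):
        phi = K @ phi
        traj.append(phi[:d])       # the first d lifted coordinates are the state
    return np.stack(traj)

X = np.random.randn(500, 2)        # stand-in two-state trajectory data
X_next = X + 0.05 * np.column_stack([X[:, 1], -np.sin(X[:, 0])])
K = fit_koopman(X, X_next)
traj = rollout(K, X[0], steps=100, d=2)
```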
LLMs can act as effective action-level supervisors in reinforcement learning, dramatically boosting the sample efficiency of SAC without sacrificing convergence guarantees.
Forget rigid physics engines: this badminton RL environment uses real player data to simulate realistic rallies and strategic gameplay.
Heuristic maritime routes lead to extreme fuel waste in nearly 5% of voyages, but this RL approach cuts that risk by almost 10x.
LLMs in embodied environments get a massive boost from structured rules, with rule retrieval alone contributing +14.9 pp to single-trial success.
LLMs struggle with spatial reasoning in embodied settings and 3D structure identification even when exposed to visual modalities, but fine-tuning smaller models offers a surprisingly effective alternative to brute-force scaling.
Animate 3D characters using bananas and plush toys – DancingBox turns everyday objects into motion capture proxies, making animation accessible to novices.
VLN agents can navigate more effectively by predicting their future states and proactively planning based on forecasted semantic map cues, rather than relying solely on historical context.
Forget training wheels: GoalVLM lets multi-agent robots navigate to any object you describe, no pre-programmed categories needed.
Encoding deformable object dynamics with particle positions unlocks sim-to-real transfer for manipulation tasks, achieving impressive real-world success rates.
Drones can now land safely in complex, unknown environments using only a camera, thanks to a new system that dynamically maps and reacts to surroundings in real-time.
Ditch fixed compute budgets: this new flow-matching method for robotic control adaptively allocates computation, speeding up simple tasks and focusing on complex ones.
Scene graphs plus LLMs let robots ask clarifying questions, boosting multi-agent task success by 15%.
ManiDreams lets robots handle real-world uncertainty in manipulation tasks without retraining, outperforming standard RL baselines under various perturbations.
Forget rigid circuits: this new method seamlessly weaves stretchable sensors directly into clothing using a clever combo of 3D printing and embroidery.
Unlock accurate monocular 3D object tracking with minimal annotation: Sparse3DTrack achieves state-of-the-art performance using only a handful of labels per track.
Robot world models can be significantly improved by directly rewarding them for generating videos that lead to physically plausible robot actions, even if the videos themselves contain visual artifacts.
A national center focused on AI and robotics in medicine could be the key to unlocking the transformative potential of these technologies in healthcare.
Continuous, high-resolution shape sensing in steerable drilling robots is now possible without directly embedding sensors on the instrument surface, thanks to a clever OFDR-based assembly.
Forget centralized control: this algorithm lets swarms of robots build complex shapes with only local communication and no global positioning.
A complete autonomy stack enables centimeter-level localization and mapping on the moon, even without GPS.
Running robotic manipulation workloads entirely onboard kills robot batteries, but offloading to the cloud tanks accuracy due to network latency, revealing a critical compute placement trade-off.
SpiderCam shatters power consumption barriers for FPGA-based 3D cameras, achieving sub-Watt operation while maintaining real-time performance.
Exploiting geometric symmetries in tensegrity structures slashes computational cost and boosts accuracy in physics-informed neural networks.
Ditch slow diffusion policies: FMER achieves 7x faster training and superior performance in sparse reward RL by using flow matching and a tractable entropy regularization term.
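For context, the generic conditional flow-matching objective is only a few lines; the sketch below shows it for an action vector, with FMER's entropy regularizer and RL coupling deliberately omitted (the 8-dim action space and network shape are made up):

```python
import torch
import torch.nn as nn

# velocity network v(x_t, t)
v_net = nn.Sequential(nn.Linear(8 + 1, 128), nn.SiLU(), nn.Linear(128, 8))

def flow_matching_loss(x1):
    x0 = torch.randn_like(x1)            # noise endpoint of the probability path
    t = torch.rand(x1.shape[0], 1)       # random time in [0, 1]
    xt = (1 - t) * x0 + t * x1           # straight-line interpolation
    target_v = x1 - x0                   # its constant velocity
    pred_v = v_net(torch.cat([xt, t], dim=-1))
    return (pred_v - target_v).pow(2).mean()

actions = torch.randn(256, 8)            # stand-in batch of expert actions
loss = flow_matching_loss(actions)
loss.backward()
```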
Finally, a rigorous RL benchmark: generate environments with *provably* optimal policies, enabling controlled algorithm evaluation against ground truth.
Demonstrator diversity unlocks the ability to learn latent actions and dynamics from offline RL data, even without explicit action labels.
Unlock scalable aerial scene understanding with SegFly, a massive RGB-T dataset generated via a novel 2D-3D-2D label propagation technique that requires minimal manual annotation.
Ditch the diffusion vs. autoregressive debate: this VLA framework uses diffusion to *draft* actions and an autoregressive model to *verify* them, boosting real-world success by nearly 20%.
Achieve SE(3) equivariance and memory scalability in point cloud analysis with coordinate-based kernels, outperforming state-of-the-art equivariant methods on diverse tasks.
By cleverly turning novel view synthesis into a self-supervised inpainting problem, VisionNVS eliminates the need for ground truth images of novel views, outperforming LiDAR-dependent baselines.
Unlock the power of MLLMs for structured data like human skeletons with a differentiable rendering approach that allows end-to-end training.
By fusing IMU-derived egomotion with visual data, Motion-MLLM lets MLLMs achieve SOTA 3D scene understanding with 40% less compute.
Achieve 100x radar data compression with only a 1% performance drop by adaptively pruning DCT coefficients based on detection confidence gradients.
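The compression side is easy to prototype. The sketch below keeps the largest DCT coefficients by magnitude, a simple stand-in for the paper's confidence-gradient ranking (array sizes and the keep ratio are illustrative):

```python
import numpy as np
from scipy.fft import dctn, idctn

def compress(frame, keep_ratio=0.01):
    C = dctn(frame, norm="ortho")
    k = max(1, int(keep_ratio * C.size))
    thresh = np.partition(np.abs(C).ravel(), -k)[-k]
    mask = np.abs(C) >= thresh              # keep the k largest coefficients
    return C[mask], mask                    # values + positions to transmit

def decompress(values, mask):
    C = np.zeros(mask.shape)
    C[mask] = values
    return idctn(C, norm="ortho")

frame = np.random.rand(128, 128)            # stand-in for a radar heatmap
vals, mask = compress(frame)                # keep_ratio=0.01 ~ 100x compression
recon = decompress(vals, mask)
```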
Forget waiting minutes for iterative optimization – Omni-3DEdit performs diverse 3D editing tasks in a single forward pass.
Skip the costly training and go straight to open-vocabulary 3D reasoning with ReLaGS, which builds a 3D semantic scene graph from language-distilled Gaussians.
A new RGB-T dataset and frequency-aware network expose the surprising limitations of existing UAV detectors when faced with real-world camouflage and complex backgrounds.
Achieve zero-shot adaptation to new tasks in complex control environments by learning a shared low-dimensional goal embedding that unifies policy and value function representations.
Forget retargeting: RoboForge's physics-optimized pipeline lets humanoids nail text-guided locomotion with better accuracy and stability.
NeRFs can now guide extraterrestrial rovers around unexpected obstacles, thanks to a novel planning framework that blends local observations with global terrain understanding.
Robot control gets a whole lot faster: ProbeFlow slashes action decoding latency by 14.8x in Vision-Language-Action models, all without retraining.
Q-value policies, traditionally outperformed by state-value policies in planning, can surpass them with the right regularization, offering a faster alternative for policy evaluation.
Panoramic 3D reconstruction gets a boost with PanoVGGT, a Transformer that handles spherical distortions and global-frame ambiguity to deliver state-of-the-art accuracy in a single pass.
Gesture-aware pretraining unlocks significant improvements in 3D hand pose estimation, proving that semantic gesture information acts as a powerful inductive bias.
By co-training flow and retrieval networks, WINFlowNets eliminates the need for pre-training, unlocking CFlowNets for dynamic robotic environments where data is scarce.
Lunar rovers can now navigate more accurately across vast distances thanks to a new SLAM system that uses readily available Digital Elevation Models to correct visual drift.
Robots can now plan 9x faster and achieve significantly higher success rates by decoupling action prediction from video generation in World-Action Models.
Achieve more precise robot control by explicitly disentangling high-level goals from low-level kinematic instructions.
Generalizing RL to continuous state and action spaces just got easier: this paper introduces an operator-theoretic framework and PPO-type algorithms that ditch finite-state assumptions.
A new mixed reality testbed lets you plug real human drivers into a CAV simulation, offering unprecedented realism for testing autonomous vehicle interactions.
Guaranteeing robot safety and task completion just got easier: this method enforces complex temporal logic constraints on pre-trained robotics models without any fine-tuning.
LLMs can navigate complex 3D environments more effectively and with far fewer tokens by using a hierarchical scene graph representation derived from omnidirectional sensor data.
Autonomous vehicles can now leverage the rich semantic understanding of VLMs for safer driving without the computational overhead, thanks to a clever training strategy that distills VLM knowledge into a real-time RL policy.
Policies trained on DexViTac's multimodal dataset achieve over 85% success in real-world dexterous manipulation, proving that high-fidelity tactile data unlocks a new level of robotic dexterity.
Human unpredictability is now a feature, not a bug: a mixed-reality testing framework leverages human interaction to generate high-quality corner cases for vehicle-infrastructure cooperation systems.
VLMs can now drive embodied agents to navigate complex environments with unprecedented efficiency, thanks to a novel framework that bridges the gap between 2D semantic understanding and 3D spatial reasoning.
ROS 2's real-time performance gets a major boost with ReDAG-RT, a user-space scheduler that cuts deadline misses by up to 30% without touching the core ROS 2 API.
Don't let your robot's brief moment of panic get lost in the noise – this new uncertainty method spotlights those critical spikes to predict failures before they happen.
Robots can think (and act) twice as fast: HeiSD's hybrid speculative decoding turbocharges embodied agents by intelligently switching between draft and retrieval strategies.
Human-robot teams can get a boost: eye-tracking data alone can predict, with nearly 90% recall, when a human teammate is struggling to understand the robot's situation.
Ditch LiDAR: 3D Gaussian Splatting, combined with semantic segmentation and stereo depth, enables real-time lunar mapping with centimeter-level accuracy.
Stop wasting compute: this RL-trained orchestration policy adaptively decides when your embodied agent should reason with an LLM, slashing latency and boosting task success compared to fixed strategies.
A $50 DIY syringe pump enables precise bidirectional control of soft robots, unlocking new possibilities for complex shape-shifting behaviors.
Kinema4D unlocks zero-shot transfer in embodied AI by simulating physically plausible 4D robot-world interactions, moving beyond rigid 2D constraints.
User-facing guardrails for LLM-enabled robots can balance flexibility and safety by offering constrained choices and clear recourse, rather than open-ended value settings.
Fine-tuning Vision-Language Model planners for robotic manipulation is now significantly more efficient and safer thanks to a novel framework that leverages video world models to simulate real-world physics.
Autonomous robots can now more safely and effectively inspect cluttered, radioactive environments by combining information gain-based planning with stochastic obstacle avoidance.
Finally, a unified software framework promises to tame the wild west of quantum dot device tuning, enabling researchers to share and adapt characterization routines across labs.
A quadrupedal robot can now provide on-demand assistance to wheelchair users, offering a more agile and less intrusive alternative to fixed robotic arms.
Neural approximations of Hamilton-Jacobi reachability can now be formally certified for safety, enabling provably safe robot navigation in unknown environments.
By blending geometry with classification, this new Finsler metric lets you trace trajectories more accurately through complex systems, like cell development, where you have both spatial data and lineage trees.
PyPhonPlan offers a new open-source toolkit to simulate speech dynamics with neurally-grounded representations, enabling researchers to model interactive speech production and perception loops.
You can provably find Nash equilibria even when one player only knows the *reaction* of the other, not their full objective.
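One way to picture the setting (an illustrative first-order sketch, not necessarily the paper's exact scheme): player 1 minimizes its own objective $f_1$ but never sees $f_2$; it only observes the opponent's reaction $y_k$ and descends against it:

$$y_k = \arg\min_{y} f_2(x_k, y), \qquad x_{k+1} = x_k - \eta\, \nabla_x f_1(x_k, y_k).$$

At a fixed point $(x^*, y^*)$ both players are simultaneously best-responding (under suitable convexity), which is exactly the Nash condition, even though player 1 never evaluated $f_2$.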
Visual SLAM loop closure just got a whole lot faster: FastLoop achieves up to 3x speedups by unleashing the power of GPU parallelism.
Rank-1 LoRA fine-tuning can safely and efficiently adapt simulated locomotion policies to real-world robots, slashing fine-tuning time by nearly half while maintaining safety.
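A rank-1 LoRA layer is small enough to show in full. This is the generic construction, not the paper's policy architecture: the frozen base weight is adapted by an outer product of two learned vectors, adding only in_features + out_features trainable parameters per layer.

```python
import torch
import torch.nn as nn

class Rank1LoRALinear(nn.Module):
    """Generic rank-1 LoRA wrapper (illustrative): the frozen base weight W
    is adapted as W + alpha * outer(b, a)."""

    def __init__(self, base: nn.Linear, alpha: float = 1.0):
        super().__init__()
        self.base, self.alpha = base, alpha
        for p in self.base.parameters():
            p.requires_grad_(False)          # sim-trained weights stay frozen
        self.a = nn.Parameter(torch.zeros(base.in_features))   # update starts at zero
        self.b = nn.Parameter(torch.randn(base.out_features) * 0.01)

    def forward(self, x):
        # (x @ a) is a scalar per sample; scaling b gives the rank-1 update
        return self.base(x) + self.alpha * (x @ self.a).unsqueeze(-1) * self.b

layer = Rank1LoRALinear(nn.Linear(64, 32))
out = layer(torch.randn(8, 64))              # -> (8, 32)
```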
Achieve intention-driven start-stop control of a rehabilitation exoskeleton from non-invasive EEG by fixing a common bias in task-based recentering.