Search papers, labs, and topics across Lattice.
100 papers published across 5 labs.
Deep learning can rescue VIO from textureless environments and rapid lighting changes.
A million-sequence, high-quality, open-source motion dataset finally lets text-to-motion models generalize beyond toy benchmarks.
Explicitly reconstructing 3D scenes with Gaussian Splatting unlocks state-of-the-art BEV perception, proving that geometric understanding is key to accurate spatial reasoning.
Maximizing entropy of future state-action visitations boosts feature coverage within single RL trajectories, offering a new exploration strategy.
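The idea behind this line can be sketched with a generic particle-based entropy proxy: reward each step by its distance to the k-th nearest neighbour among the state-action features already visited in the same trajectory, so revisited regions earn less than novel ones. This is a hedged illustration of within-trajectory entropy maximization in general, not the paper's algorithm; `knn_entropy_bonus` and its constants are hypothetical.

```python
import numpy as np

def knn_entropy_bonus(features: np.ndarray, k: int = 3) -> np.ndarray:
    """Per-step intrinsic bonus: distance to the k-th nearest neighbour
    among the state-action features visited earlier in this trajectory.
    Larger distance ~ lower local density ~ higher entropy contribution."""
    n = len(features)
    bonuses = np.zeros(n)
    for t in range(n):
        if t < k:
            continue  # not enough history yet to estimate density
        dists = np.linalg.norm(features[:t] - features[t], axis=1)
        kth = np.partition(dists, k - 1)[k - 1]
        bonuses[t] = np.log(1.0 + kth)  # particle-based entropy proxy
    return bonuses

# Steps that revisit the same region earn a smaller bonus than novel ones.
rng = np.random.default_rng(0)
clustered = np.vstack([np.zeros((10, 2)), rng.normal(5, 0.1, (1, 2))])
b = knn_entropy_bonus(clustered)
```

In a full agent this bonus would be added to the extrinsic reward; here the final, out-of-cluster step receives a strictly larger bonus than the repeated in-cluster steps.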
Coordinating multi-robot teams to complete manipulation tasks just got easier: GoC-MPC handles dynamic task assignments and disturbances without training data or environment models.
VLAs aren't just memorizing training data; sparse autoencoders reveal a hidden layer of generalizable motion primitives that can be steered to control robot behavior across tasks.
Forget brittle, hand-coded robot assembly routines: ATG-MoE learns complex, multi-skill manipulation directly from visual and language inputs, achieving impressive success rates in both simulation and real-world industrial tasks.
Forget hand-crafted assets and heuristics: V-Dreamer uses video generation models to automatically create diverse, physically plausible robotic simulation environments and trajectories directly from language.
A peer-like social robot can effectively augment literacy tutor support for newcomer children, offering personalized language and cultural learning in resource-constrained community settings.
Differentiable collision checking in configuration space, previously a major hurdle, is now achievable with zero-shot generalization thanks to CSSDF-Net.
Forget external sensors: embedding a simple nickel wire into a pneumatic actuator unlocks surprisingly accurate force sensing via inductance, even with hysteresis.
Information-theoretic limits on control performance are now computable even when feedback matters most, thanks to a new bound that self-consistently accounts for the controller's impact on sensor information.
Ditch the threshold: this tactile-sensing robotic hand uses contact status recognition to detect slip with 96% accuracy, even on new materials.
Humanoid robots can now traverse complex terrains with human-like gaits, thanks to a surprisingly simple and efficient framework that eschews adversarial training.
Unlock real-time 3D understanding: MonoArt achieves state-of-the-art monocular articulated object reconstruction without relying on multi-view data or external motion templates.
Achieve 9x lower trajectory error and 3x better FID in motion generation by using a diffusion-based discrete motion tokenizer that elegantly handles both semantic and kinematic constraints.
VLMs struggle with spatial reasoning, but a clever decomposition into sub-problems and probabilistic recombination unlocks significantly better metric-semantic grounding.
Autonomous driving models can be made significantly more robust and safe by explicitly de-confounding their training via causal intervention, eliminating reliance on spurious correlations.
Particle physics techniques can give your drone superhuman senses: statistical methods from CERN enable UAVs to detect subtle blade damage with calibrated uncertainty, outperforming standard anomaly detection methods.
Encoding realism as a knowledge graph of interpretable traits unlocks zero-shot sim2real image translation that outperforms state-of-the-art diffusion methods.
Achieve state-of-the-art panoramic depth estimation without any training by cleverly exploiting the 3D consistency priors embedded within existing vision foundation models.
Turns out, VLA models are mostly just looking at the scene: visual pathways dominate action generation, and language only matters when the visuals are ambiguous.
Ditch the heavyweight controllers: these lightweight MPC approaches bring real-time attitude synchronization to resource-constrained spacecraft.
Tapered backbones in 3D-printed continuum robots unlock enhanced compliance and manipulability, all while slashing costs and assembly time.
Democratizing social robotics research, M offers a low-cost, open-source platform that's easy to reproduce, modify, and deploy in real-world settings.
A single 3D-printed part can replace complex multi-link laparoscopic graspers, slashing manufacturing costs while maintaining reliable bistable actuation.
DriveTok achieves unified multi-view reconstruction and understanding by learning scene tokens that integrate semantic, geometric, and textural information, outperforming existing 2D tokenizers in autonomous driving scenarios.
DROID-SLAM achieves robust real-time RGB SLAM in dynamic environments by explicitly modeling per-pixel uncertainty, outperforming existing methods that struggle with unknown dynamic objects and cluttered scenes.
Differentiable environments and backpropagation offer a surprisingly effective alternative to reinforcement learning for AAV trajectory optimization, sidestepping credit assignment problems.
Decentralized MPC with control barrier functions lets multi-robot quadrupeds safely navigate complex environments in real-time, achieving performance on par with centralized approaches but with significantly reduced computation.
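The control-barrier-function ingredient mentioned here can be illustrated with the textbook single-constraint case, where the safety QP has a closed form: minimally deflect the desired input so the barrier condition dh/dt ≥ -αh holds. This is a minimal sketch for single-integrator dynamics, not the paper's decentralized MPC; `cbf_filter` and its arguments are hypothetical names.

```python
import numpy as np

def cbf_filter(x, u_des, obstacle, radius, alpha=1.0):
    """Minimally modify u_des so the barrier h(x) = ||x - o||^2 - r^2
    satisfies dh/dt >= -alpha * h for single-integrator dynamics x' = u.
    Closed-form solution of the one-constraint safety QP."""
    h = np.dot(x - obstacle, x - obstacle) - radius**2
    grad_h = 2.0 * (x - obstacle)          # dh/dx
    violation = -alpha * h - grad_h @ u_des
    if violation <= 0.0:                   # desired input already safe
        return u_des
    # Project u_des onto the safe half-space {u : grad_h . u >= -alpha * h}
    return u_des + (violation / (grad_h @ grad_h)) * grad_h

# A robot at (2, 0) commanding full speed toward an obstacle at the origin
# gets deflected; a command pointing away passes through unchanged.
x = np.array([2.0, 0.0])
u_safe = cbf_filter(x, np.array([-5.0, 0.0]), np.array([0.0, 0.0]), 1.0)
u_ok = cbf_filter(x, np.array([1.0, 0.0]), np.array([0.0, 0.0]), 1.0)
```

The filtered input satisfies the barrier condition with equality when active, which is what lets such filters sit between a nominal planner and the actuators.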
Digital twins can now discriminate between different types of cyberattacks on critical infrastructure, enabling targeted responses instead of costly full shutdowns.
Real-time robotic perception just got a major upgrade: OnlinePG achieves open-vocabulary panoptic mapping with 3D Gaussian Splatting, enabling robots to understand and interact with environments in a way that was previously impossible.
Quadrupedal robots can now skate circles around traditional designs, thanks to a co-design approach that unlocks dynamic maneuvers like hockey stops and self-alignment.
LLMs can navigate more efficiently in unfamiliar environments by reasoning over a tree of possible paths, not just isolated waypoints, enabling them to consider en-route information gain and prune unpromising branches.
Robots can learn faster and generalize better by encoding dynamics directly into their neural network architecture, outperforming standard transformers and GNNs.
Ditch the power-hungry actuators: this passive elastic-folding mechanism lets you stack and airdrop sensors that reliably self-deploy into 3D structures.
Reconstructing realistic hand-object interactions from video just got an order of magnitude faster, thanks to a novel Gaussian Splatting approach that ensures physical consistency.
Overcoming occlusion in hand-object pose estimation just got easier: GenHOI leverages hierarchical semantic knowledge and hand priors to achieve state-of-the-art results on challenging benchmarks.
Hybrid LiDAR-inertial-visual odometry (LIVO) robustly handles visually challenging conditions, outperforming sparse-direct methods by combining direct photometric methods with learning-based feature descriptors.
Humanoid robots can now generate more empathetic and instruction-aware gestures thanks to a new diffusion framework conditioned on affective estimation and pedagogical reasoning.
Forget painstakingly designing simulation environments: generative 3D world models let you RL-fine-tune robot VLAs with massive scene diversity, boosting real-world transfer by 3x.
Unlock real-time control for massive multi-agent swarms: this method slashes computation from cubic to linear with horizon length, making long-horizon density-driven control practical.
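For context on the cubic-to-linear claim: horizon-linear cost is classically what recursive, stage-by-stage solves deliver, versus one dense QP over the stacked trajectory. The sketch below is the standard LQR Riccati backward pass as a generic illustration of that structure, assuming linear dynamics and quadratic cost; it is not the paper's density-driven swarm method.

```python
import numpy as np

def lqr_backward_pass(A, B, Q, R, QT, T):
    """Riccati recursion: one sweep over the horizon yields all feedback
    gains, so cost grows linearly in T rather than cubically as for a
    single dense QP over the stacked trajectory."""
    P = QT
    gains = []
    for _ in range(T):
        S = R + B.T @ P @ B
        K = np.linalg.solve(S, B.T @ P @ A)   # stage feedback gain
        P = Q + A.T @ P @ (A - B @ K)         # cost-to-go update
        gains.append(K)
    return gains[::-1]  # gains ordered t = 0 .. T-1

# Double-integrator example: position/velocity state, force input.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
gains = lqr_backward_pass(A, B, Q=np.eye(2), R=np.array([[1.0]]),
                          QT=np.eye(2), T=50)
```

Each stage touches only fixed-size matrices, so doubling the horizon doubles the work; the resulting time-zero gain stabilizes the closed loop.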
Decomposing GUI agent trajectories into verifiable milestones and auditing the evidence chain yields a 10% boost in RL training performance, outperforming single-judge reward systems.
Flow-based VLAs can react to environmental changes ten times faster by adaptively prioritizing near-term actions during sampling, unlocking unprecedented real-time responsiveness.
Seemingly efficient VLA models can be surprisingly inefficient when deployed on robots, highlighting the need to move beyond standard metrics like FLOPs and parameters.
Neural solvers can now effectively handle the complexities of multi-agent coordination and multi-objective trade-offs in routing problems, outperforming traditional heuristics.
Embodied navigation agents, already struggling, fall apart when faced with the kinds of messy, real-world sensor and instruction corruptions that NavTrust now exposes.
Optimal multi-agent path planning with asynchronous actions is now provably complete, sidestepping the theoretical incompleteness of prior continuous-time approaches.
Forget blind exploration: injecting LLM-derived semantic understanding into DRL dramatically boosts UAV-aided network connectivity and slashes energy consumption.
Unlock geometry-precise 3D generation by directly conditioning diffusion models on readily available point cloud priors, outperforming existing image- or text-conditioned methods.
Guaranteeing safety in spacecraft autonomy is now more tractable: a CBF-CLF informed imitation learning approach achieves NMPC-level performance with real-time feasibility on commodity hardware.
Agents can now "hallucinate" optimal viewpoints for reasoning by storing and re-rendering scenes with 3D Gaussian Splatting, enabling recovery from initial observation failures.
Hierarchical memory, inspired by human cognition, beats standard approaches in robotic manipulation tasks requiring both precise tracking and long-term retention.
Robots can now manipulate objects with greater dexterity and adaptability thanks to a new world model that leverages both vision and high-frequency tactile feedback to predict and react to contact dynamics.
Tactile sensing closes the sim2real gap for deformable object tracing, enabling a single imitation learning model to achieve impressive generalization across diverse objects.
Standard DRL collapses in volatile environments because it mistakes irreducible noise for a lack of data, but RE-SAC fixes this by explicitly separating these uncertainties.
Robots can now train in realistic, thermally-accurate simulated fires, paving the way for safer and more reliable real-world firefighting deployments.
LLMs can control robots for complex disassembly tasks, but only if you give them structured APIs – otherwise, expect a 43% failure rate.
Even the most advanced VLMs like GPT-4o, GPT-5, and Gemini 2.5 Flash are outperformed in multi-actor human-robot interaction grounding by a system that selectively invokes VLMs based on a lightweight perception pipeline.
Achieve real-time online learning for model predictive control with a novel spatio-temporal Gaussian Process approximation that maintains constant computational complexity.
By explicitly reasoning in 3D, VolumeDP leaps ahead of 2D-based imitation learning methods, achieving a remarkable 14.8% improvement on the LIBERO benchmark and robust real-world generalization.
By iteratively reasoning over video snippets with a Chain-of-Thought, $\text{R}^2$VLM achieves state-of-the-art long-horizon task progress estimation without needing to process entire videos at once.
Ditching rigid digital twins for adaptable world models could unlock truly intelligent edge computing in 6G networks.
LLMs can be prompted to generate part-aware instructions that substantially improve open-vocabulary 3D affordance grounding by linking semantically similar affordances and refining geometric differentiation.
Forget complex communication protocols – this trust-based algorithm lets agents learn to cooperate in competitive environments with minimal overhead.
By treating 3D scene editing as goal-regressive planning rather than pure generation, Edit-As-Act achieves instruction fidelity, semantic consistency, and physical plausibility that existing methods miss.
Legged robots can navigate more reliably with noisy sensors thanks to a new state estimator that avoids Gaussian noise assumptions.
Achieve stable, real-time kilometer-scale autonomous driving simulations by generating vector-graph tiles incrementally using a novel diffusion flow approach.
Forget verbose instructions: this new VLN paradigm uses floor plans to guide navigation with concise commands, boosting success rates by 60%.
Robots can now navigate based on your spoken preferences and visual context, thanks to a clever fusion of VLMs, LLMs, and multi-objective RL.
Locomotion policies, often considered black boxes, can autonomously learn interpretable phase structures and branching logic, revealing a hidden order in their decision-making.
Network coding, often overlooked in robotics, can drastically improve the reliability and timeliness of multi-robot communication, outperforming traditional retransmission methods in safety-critical scenarios.
Ergodic control lets swarms of robots cooperatively manufacture micro-patterned surfaces, unlocking scalable production of materials with enhanced physical properties.
A wearable hand exoskeleton that prioritizes comfort and adaptability unlocks scalable robot learning by enabling direct policy training from raw visual data, bypassing complex post-processing.
Robots often ignore your commands mid-task, but ReSteer offers a way to fix this by pinpointing and patching the "blind spots" in their training data.
Ditch costly PIDE integration: RHYME-XT learns the flow map directly, offering a continuous-time, discretization-invariant representation that beats state-of-the-art neural operators.
Robots can now nimbly navigate complex, multi-floor environments without prior training, thanks to a new strategy that dynamically switches between exploration, recovery, and memory recall.
Legged robots can now perform robust parkour with a 1-meter visual blind zone, thanks to a novel architecture that tightly couples vision, proprioception, and physics-based state estimation.
Synthetic data and virtual environments are rapidly becoming indispensable for autonomous driving, but realizing their full potential requires tackling challenges like Sim2Real transfer and scalable safety validation.
Achieve state-of-the-art semantic 3D reconstruction from sparse views by intelligently pruning redundant Gaussians and blending 2D and 3D semantic cues.
Synthesizing realistic 6-DOF object manipulation trajectories in complex 3D environments just got a whole lot better with GMT, a multimodal transformer that substantially outperforms existing methods.
Cycle consistency training unlocks stable and accurate inverse kinematics for wearable soft robots, even with their inherent nonlinearities and hysteresis.
Representing highly nonlinear vehicle dynamics in a lifted linear space via Koopman operator theory enables state-of-the-art long-term state estimation for complex electric trucks.
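The Koopman lifting named in this line can be sketched with extended dynamic mode decomposition (EDMD): choose nonlinear features of the state, then fit a linear operator between lifted snapshots by least squares. A minimal sketch on a toy system, assuming a hand-picked dictionary; `edmd_fit` and the feature choice are illustrative, not the paper's truck model.

```python
import numpy as np

def edmd_fit(X, Y, lift):
    """EDMD: fit a linear operator K with lift(Y) ~= lift(X) @ K by least
    squares, so nonlinear dynamics act linearly in the lifted space."""
    PhiX = np.array([lift(x) for x in X])
    PhiY = np.array([lift(y) for y in Y])
    K, *_ = np.linalg.lstsq(PhiX, PhiY, rcond=None)
    return K

# Toy map x' = x^2: nonlinear in x, but the lifted coordinates
# [x, x^2, x^4] evolve (partly) linearly: x -> x^2 -> x^4.
lift = lambda s: np.array([s[0], s[0]**2, s[0]**4])
X = np.linspace(0.1, 0.9, 20).reshape(-1, 1)
Y = X**2
K = edmd_fit(X, Y, lift)
pred = lift(np.array([0.5])) @ K   # first component predicts next state
```

Because x' = x² lies exactly in the span of the dictionary, the first lifted coordinate reproduces the true next state; in practice the dictionary is learned or much richer, and state estimation runs linear filters on the lifted system.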
LLMs can act as effective action-level supervisors in reinforcement learning, dramatically boosting the sample efficiency of SAC without sacrificing convergence guarantees.
Forget rigid physics engines, this badminton RL environment uses real player data to simulate realistic rallies and strategic gameplay.
Heuristic maritime routes lead to extreme fuel waste in nearly 5% of voyages, but this RL approach cuts that risk by almost 10x.
LLMs in embodied environments get a massive boost from structured rules, with rule retrieval alone contributing +14.9 pp to single-trial success.
LLMs struggle with spatial reasoning in embodied settings and 3D structure identification even when exposed to visual modalities, but fine-tuning smaller models offers a surprisingly effective alternative to brute-force scaling.
Animate 3D characters using bananas and plush toys – DancingBox turns everyday objects into motion capture proxies, making animation accessible to novices.
VLN agents can navigate more effectively by predicting their future states and proactively planning based on forecasted semantic map cues, rather than relying solely on historical context.
Forget training wheels: GoalVLM lets multi-agent robots navigate to any object you describe, no pre-programmed categories needed.
Encoding deformable object dynamics with particle positions unlocks sim-to-real transfer for manipulation tasks, achieving impressive real-world success rates.
Drones can now land safely in complex, unknown environments using only a camera, thanks to a new system that dynamically maps and reacts to surroundings in real-time.
Ditch fixed compute budgets: this new flow-matching method for robotic control adaptively allocates computation, speeding up simple tasks and focusing on complex ones.
Scene graphs plus LLMs let robots ask clarifying questions, boosting multi-agent task success by 15%.
ManiDreams lets robots handle real-world uncertainty in manipulation tasks without retraining, outperforming standard RL baselines under various perturbations.
Forget rigid circuits – this new method seamlessly weaves stretchable sensors directly into clothing using a clever combo of 3D printing and embroidery.
Unlock accurate monocular 3D object tracking with minimal annotation: Sparse3DTrack achieves state-of-the-art performance using only a handful of labels per track.
Robot world models can be significantly improved by directly rewarding them for generating videos that lead to physically plausible robot actions, even if the videos themselves contain visual artifacts.
A national center focused on AI and robotics in medicine could be the key to unlocking the transformative potential of these technologies in healthcare.