University of California, Los Angeles
Tactile feedback integration boosts robot manipulation success rates by over 40% in contact-rich tasks, redefining the limits of vision-language-action models.
Robot actions can serve as powerful geometric supervision, enabling sparse 3D representations that are both reusable and effective across diverse manipulation tasks.
Train a text-to-image model that rivals the state of the art with one-fifth of the compute by using GPT-4 to generate better captions.
Achieve 50% bitrate savings in ultra-low-bitrate image compression by cleverly turning image decoding into a next-frame prediction problem using video diffusion priors.
Finally, a feed-forward method can generate realistic, simulation-ready garment patterns directly from single "in-the-wild" images, bypassing the need for multi-view inputs or expensive optimization.
Origami, the "Hello, World!" of physical intelligence, is now tractable: Learn2Fold uses LLMs and graph-structured world models to generate valid folding sequences from text.