Search papers, labs, and topics across Lattice.
3
0
6
7
Unified multimodal models secretly contain separate inference pathways for generation and understanding, and FlashU unlocks this hidden potential for 2x speedup without retraining.
Existing affordance prediction models fall flat when confronted with the wide-angle, distorted reality of panoramic vision, but a new training-free pipeline called PAP rises to the challenge.
Pre-trained video diffusion models can be deterministically adapted into state-of-the-art zero-shot depth estimators, sidestepping the need for massive labeled datasets.