Overcoming occlusion in hand-object pose estimation just got easier: GenHOI leverages hierarchical semantic knowledge and hand priors to achieve state-of-the-art results on challenging benchmarks.
Pixel-perfect geospatial reasoning is now possible, thanks to a vision-language model that adaptively fuses multi-modal and multi-temporal Earth observation data.
Existing affordance prediction models fall flat when confronted with the wide-angle, distorted reality of panoramic vision, but a new training-free pipeline called PAP rises to the challenge.
Pre-trained video diffusion models can be deterministically adapted into state-of-the-art zero-shot depth estimators, sidestepping the need for massive labeled datasets.