×10⁻⁴, with 30k and 20k training iterations, respectively. The batch size is set to 8, and AdamW [32] is adopted for optimization. Image resolutions follow prior works [51, 13].

Table 2: Comparison of zero-shot performance with other state-of-the-art methods on the Pascal-Part-116 and ADE20K-Part-234 datasets.
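The training recipe described above (AdamW, batch size 8, 30k and 20k iterations on the two benchmarks, learning rate on the order of 10⁻⁴) can be sketched as a plain configuration. This is an illustrative reconstruction, not the authors' code: the exact learning-rate coefficient is truncated in the source, so `1e-4` below is a placeholder, and the dictionary keys and `iterations_remaining` helper are hypothetical names.

```python
# Hypothetical training configuration mirroring the recipe in the text.
# The learning-rate coefficient is a placeholder (the source truncates it);
# batch size, iteration counts, and optimizer come from the paper excerpt.
TRAIN_CONFIGS = {
    "Pascal-Part-116": {
        "optimizer": "AdamW",   # AdamW [32], as stated in the text
        "lr": 1e-4,             # placeholder coefficient, order 1e-4
        "batch_size": 8,
        "iterations": 30_000,   # 30k iterations for Pascal-Part-116
    },
    "ADE20K-Part-234": {
        "optimizer": "AdamW",
        "lr": 1e-4,
        "batch_size": 8,
        "iterations": 20_000,   # 20k iterations for ADE20K-Part-234
    },
}

def iterations_remaining(dataset: str, step: int) -> int:
    """Return how many optimization steps are left for a given benchmark."""
    total = TRAIN_CONFIGS[dataset]["iterations"]
    return max(total - step, 0)
```

With a fixed batch size of 8, the two schedules correspond to roughly 240k and 160k training samples seen, respectively.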
By disentangling semantic and contextual cues in vision-language models, PCA-Seg achieves state-of-the-art open-vocabulary segmentation with only 0.35M additional parameters per block.