This work presents EarthMind, a novel vision-language framework for multi-granular and multi-sensor Earth observation (EO) data understanding. EarthMind outperforms existing methods on multiple public EO benchmarks, showcasing its potential to handle both multi-granular and multi-sensor challenges in a unified framework.
Pre-trained video diffusion models can be deterministically adapted into state-of-the-art zero-shot depth estimators, sidestepping the need for massive labeled datasets.