University of Trento, Trento, Italy
This work presents EarthMind, a novel vision-language framework for multi-granular and multi-sensor Earth Observation (EO) data understanding. EarthMind outperforms existing methods on multiple public EO benchmarks, showcasing its potential to handle both multi-granular and multi-sensor challenges in a unified framework.
End-to-end autonomous driving can ditch expert demonstrations and still achieve state-of-the-art performance, thanks to a risk-aware world model that learns to predict and avoid hazardous outcomes.
Stop wasting compute on hallucinated images: a new method can detect and correct object omissions in diffusion models *before* they finish generating.
EarthMind demonstrates that hierarchical cross-modal attention across optical and SAR data significantly boosts MLLM performance on Earth Observation tasks, outperforming models limited to single-sensor inputs.