D. Paudel

Research focus

Computer Vision (4)
Architecture Design (Transformers, SSMs, MoE) (2)
Multimodal Models (2)
Training Efficiency & Optimization (1)
Robotics & Embodied AI (1)

Frequent co-authors

L. V. Gool (5)
Yan Shu (2)
Zhitong Xiong (2)
B. Demir (2)

Papers (5)

2025

Yan Shu +7 · 2025 · also Trento

EarthMind: Towards Multi-Granular and Multi-Sensor Earth Observation with Large Multimodal Models

This work presents EarthMind, a vision-language framework for multi-granular and multi-sensor Earth Observation (EO) data understanding. It outperforms existing methods on multiple public EO benchmarks, handling both multi-granular and multi-sensor challenges in a unified framework.

Yan Shu, B. Ren, Zhitong Xiong +5

Apr 22, 2026

Nedyalko Prisadnikov +3 · Apr 22, 2026 · also Sofia University "St. Kliment Ohridski"

Self-supervised pretraining for an iterative image size agnostic vision transformer

Image-size agnostic vision transformers are now a practical reality, thanks to a new self-supervised pretraining method that maintains constant computational cost regardless of input resolution.

Nedyalko Prisadnikov, D. Paudel, Yuqian Fu +1

Architecture Design (Transformers, SSMs, MoE) · Computer Vision · Training Efficiency & Optimization

Apr 1, 2026

Yuheng Zhang +9 · Apr 1, 2026 · also Hunan, SJTU, Sofia University "St. Kliment Ohridski"

ProOOD: Prototype-Guided Out-of-Distribution 3D Occupancy Prediction

Autonomous vehicles can now better identify the unexpected, thanks to a new method that boosts out-of-distribution detection by up to 20% without retraining.

Yuheng Zhang, Mengfei Duan, Kunyu Peng +7

Computer Vision · Robotics & Embodied AI

Mar 12, 2026

Sofia University "St. Kliment Ohridski" · Mar 12, 2026

OSM-based Domain Adaptation for Remote Sensing VLMs

Forget expensive teacher models and manual labeling: a base VLM paired with OpenStreetMap data can annotate itself for remote sensing tasks, achieving state-of-the-art performance at a fraction of the cost.

Stefan Maria Ailuro, Mario Markov, Mohammad Mahdi +3

Computer Vision · Data Curation & Synthetic Data · Multimodal Models

Jun 2, 2025

EarthMind: Leveraging Cross-Sensor Data for Advanced Earth Observation Interpretation with a Unified Multimodal LLM

EarthMind demonstrates that hierarchical cross-modal attention across optical and SAR data significantly boosts MLLM performance on Earth Observation tasks, outperforming models limited to single-sensor inputs.

Yan Shu, Bin Ren, Zhitong Xiong +5

Architecture Design (Transformers, SSMs, MoE) · Computer Vision · Multimodal Models
