The paper introduces EarthSynth, a diffusion-based generative model that synthesizes multi-category, cross-satellite labeled Earth observation data to address the scarcity of labeled remote sensing imagery (RSI). EarthSynth is trained on the EarthSynth-180K dataset using a Counterfactual Composition training strategy, whose three-dimensional batch-sample selection mechanism improves training data diversity and category control. A rule-based R-Filter then selects the more informative synthetic samples for downstream tasks. Experiments on scene classification, object detection, and semantic segmentation in open-world scenarios show significant improvements on open-vocabulary understanding tasks.
Overcome the scarcity of labeled remote sensing data with EarthSynth, a diffusion model that generates multi-category, cross-satellite labeled imagery and significantly improves performance on downstream tasks such as scene classification, object detection, and semantic segmentation.
Remote sensing image (RSI) interpretation is typically limited by the scarcity of labeled data. To tackle this challenge, we propose EarthSynth, a diffusion-based generative foundation model that synthesizes multi-category, cross-satellite labeled Earth observation data for downstream RSI interpretation tasks. To the best of our knowledge, EarthSynth is the first to explore multi-task generation for remote sensing, tackling the challenge of limited generalization in task-oriented synthesis for RSI interpretation. EarthSynth, trained on the EarthSynth-180K dataset, employs the Counterfactual Composition training strategy with a three-dimensional batch-sample selection mechanism to improve training data diversity and enhance category control. Furthermore, we propose R-Filter, a rule-based method that selects the more informative synthetic data for downstream tasks. We evaluate EarthSynth on scene classification, object detection, and semantic segmentation in open-world scenarios. The results show significant improvements on open-vocabulary understanding tasks, offering a practical solution for advancing RSI interpretation.