This paper introduces Structure-Aware Latent Diffusion (SALD), an edge-cloud collaborative super-resolution framework designed to address bandwidth limitations in remote sensing data transmission. At the edge, SALD decouples imagery into a compressed low-frequency payload and a lightweight structural prior, transmitting this representation to minimize bandwidth consumption. On the cloud side, a Structure-Gated Large Kernel module and a Semantic-Guidance Engine within a diffusion backbone leverage the structural priors to capture long-range dependencies and suppress hallucinations, improving perceptual quality and downstream task performance.
Overcome the bandwidth bottleneck in remote sensing with a collaborative edge-cloud approach that transmits structural priors, enabling high-fidelity super-resolution and boosting downstream perception tasks even under extreme compression.
The exponential growth of high-resolution remote sensing data faces a severe bottleneck in satellite-to-ground transmission. Limited downlink bandwidth forces extremely high compression ratios, which irreversibly destroy the high-frequency structural details essential for downstream machine perception tasks such as object detection. Current super-resolution (SR) techniques attempt to recover these details, but regression-based methods often yield over-smoothed textures, while generative diffusion models frequently introduce structural hallucinations that mislead detection systems. To address this trade-off, we propose the Structure-Aware Latent Diffusion (SALD) framework, an asymmetric edge-cloud collaborative SR system. At the resource-constrained edge, the system decouples imagery into a highly compressed low-frequency payload and a lightweight soft structural prior; transmitting this decoupled representation minimizes bandwidth consumption. On the powerful cloud side, we introduce a Structure-Gated Large Kernel (SGLK) module and a Semantic-Guidance Engine (SGE) within the diffusion backbone. These modules use the transmitted structural priors to gate large-kernel convolutions, capturing the long-range dependencies inherent in aerial scenes while actively suppressing generative hallucinations. Extensive experiments on the MSCM and UCMerced datasets demonstrate that, even under extreme bandwidth constraints, SALD achieves superior perceptual quality (LPIPS) and significantly improves downstream performance in both scene classification and small-target detection.
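To make the pipeline concrete, the sketch below illustrates, under our own assumptions, the two ideas the abstract describes: an edge-side split of an image into a heavily downsampled low-frequency payload plus a lightweight soft structural prior, and a cloud-side structure-gated large-kernel block in which that prior gates long-range context. The Sobel-style edge map, kernel sizes, and module layout are illustrative placeholders, not the paper's actual implementation.

```python
# Hypothetical sketch of SALD's edge-side decoupling and the Structure-Gated
# Large Kernel (SGLK) idea. All design choices here (Sobel prior, depthwise
# 13x13 kernel, sigmoid gate) are assumptions made for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


def edge_decouple(img: torch.Tensor, scale: int = 8):
    """Split an image into a low-frequency payload and a soft structural prior.

    img: (B, C, H, W) tensor in [0, 1].
    Returns (low_freq, prior): a heavily downsampled payload and a single-channel
    edge map, both cheap to transmit compared to the full-resolution image.
    """
    # Low-frequency payload: aggressive downsampling stands in for high-ratio compression.
    low_freq = F.interpolate(img, scale_factor=1 / scale, mode="bilinear", align_corners=False)

    # Soft structural prior: gradient magnitude of the grayscale image (assumed design).
    gray = img.mean(dim=1, keepdim=True)
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3).to(gray)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(gray, kx, padding=1)
    gy = F.conv2d(gray, ky, padding=1)
    prior = torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)
    return low_freq, prior / (prior.amax(dim=(2, 3), keepdim=True) + 1e-6)


class SGLK(nn.Module):
    """Structure-Gated Large Kernel block (illustrative).

    A large depthwise convolution captures long-range context; its output is
    gated by a map predicted from the transmitted structural prior, so features
    are amplified along real structures and attenuated elsewhere.
    """

    def __init__(self, channels: int, kernel_size: int = 13):
        super().__init__()
        self.large_kernel = nn.Conv2d(
            channels, channels, kernel_size,
            padding=kernel_size // 2, groups=channels,  # depthwise large kernel
        )
        self.pointwise = nn.Conv2d(channels, channels, 1)
        # Gate predicted from the 1-channel structural prior, squashed to (0, 1).
        self.gate = nn.Sequential(nn.Conv2d(1, channels, 3, padding=1), nn.Sigmoid())

    def forward(self, feat: torch.Tensor, prior: torch.Tensor) -> torch.Tensor:
        # Resize the prior to this stage's feature resolution.
        prior = F.interpolate(prior, size=feat.shape[-2:], mode="bilinear", align_corners=False)
        ctx = self.pointwise(self.large_kernel(feat))
        return feat + self.gate(prior) * ctx  # residual, structure-gated long-range context


if __name__ == "__main__":
    img = torch.rand(1, 3, 256, 256)
    payload, prior = edge_decouple(img)   # what the edge would transmit
    feat = torch.rand(1, 64, 64, 64)      # stand-in for a diffusion-backbone feature map
    out = SGLK(channels=64)(feat, prior)
    print(payload.shape, prior.shape, out.shape)
```

Read this way, the gate keeps the large receptive field active along transmitted structures (roads, building edges, field boundaries) and damps it elsewhere, which is how the framework aims to retain long-range aerial context without hallucinating detail where the prior shows none.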