This paper introduces RDNet, a novel salient object detection network for remote sensing images that addresses challenges related to object scale variations and computational costs. RDNet employs a Swin Transformer backbone for global context modeling and incorporates three key modules: a dynamic adaptive detail-aware (DAD) module using varied convolution kernels, a frequency-matching context enhancement (FCE) module, and a region proportion-aware localization (RPL) module. Experimental results demonstrate that RDNet achieves superior detection performance compared to existing state-of-the-art methods by effectively handling scale variations and improving object localization accuracy.
By dynamically adapting convolution kernels based on object region proportions, RDNet achieves state-of-the-art salient object detection in remote sensing images, outperforming existing methods that struggle with scale variations.
Salient object detection (SOD) in remote sensing images faces significant challenges due to large variations in object sizes, the computational cost of self-attention mechanisms, and the limitations of convolutional neural network (CNN)-based extractors in capturing global context and long-range dependencies. Existing methods that rely on fixed convolution kernels often struggle to adapt to diverse object scales, leading to detail loss or irrelevant feature aggregation. To address these issues, this work aims to enhance robustness to scale variations and achieve precise object localization. We propose the region proportion-aware dynamic adaptive SOD network (RDNet), which replaces the CNN backbone with the Swin Transformer for global context modeling and introduces three key modules: 1) the dynamic adaptive detail-aware (DAD) module, which applies varied convolution kernels guided by object region proportions; 2) the frequency-matching context enhancement (FCE) module, which enriches contextual information through wavelet interactions and attention; and 3) the region proportion-aware localization (RPL) module, which employs cross-attention to highlight semantic details and integrates a proportion guidance (PG) block to assist the DAD module. By combining these modules, RDNet achieves robustness against scale variations and accurate localization, delivering superior detection performance compared with state-of-the-art methods.
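The idea of letting object region proportions guide the choice of convolution kernels can be illustrated with a minimal sketch. It is not the RDNet implementation: the function names, the proportion thresholds (0.05 and 0.30), and the kernel-size sets are all hypothetical, chosen only to show how a proportion estimate might select among kernels of different sizes. The direction of the mapping (smaller objects get smaller kernels) is likewise an assumption, not something the abstract states.

```python
from typing import List, Sequence


def region_proportion(mask: Sequence[Sequence[int]]) -> float:
    """Return the fraction of pixels marked salient (non-zero) in a 2D mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask must contain at least one pixel")
    salient = sum(1 for row in mask for value in row if value)
    return salient / total


def select_kernel_sizes(proportion: float) -> List[int]:
    """Map a region proportion to a set of odd kernel sizes.

    The thresholds and kernel sets below are illustrative assumptions,
    not values taken from the paper.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    if proportion < 0.05:
        # Assumed: small objects get small kernels to limit irrelevant context.
        return [1, 3]
    if proportion < 0.30:
        # Assumed: medium objects get mid-sized kernels.
        return [3, 5]
    # Assumed: large objects get larger kernels to cover more of the object.
    return [5, 7, 9]


# Toy 4x4 mask with a 2x2 salient region in the centre.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
proportion = region_proportion(mask)
kernels = select_kernel_sizes(proportion)
```

In the toy mask, 4 of 16 pixels are salient, so the proportion is 0.25 and the sketch picks the mid-sized kernel set. In the network described above, the proportion estimate would come from the PG block rather than from a ground-truth mask, and the selected kernels would be applied to feature maps inside the DAD module.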