The paper introduces FSCDiff, a novel Fourier-Spatial Entangled Conditional Diffusion model, to improve the accuracy and robustness of underwater salient object detection (USOD). FSCDiff addresses limitations of existing spatial-domain methods by incorporating Fourier-domain information and leveraging the iterative generation capabilities of diffusion models to handle insufficient representation and boundary shift issues. Experiments on USOD10K and USOD datasets demonstrate that FSCDiff outperforms state-of-the-art USOD methods.
By fusing Fourier-domain and spatial-domain information within a conditional diffusion framework, FSCDiff improves the accuracy of underwater salient object detection and outperforms state-of-the-art methods on the USOD10K and USOD datasets.
Salient object detection (SOD) plays a crucial role in image understanding and visual guidance. However, due to the complexity of underwater environments, the accuracy of underwater salient object detection (USOD) is often low. To improve its accuracy and robustness, we propose a novel Fourier-Spatial Entangled Conditional Diffusion model (FSCDiff). Unlike existing spatial-domain-aware RGB-D methods, which rely on pixel-level probabilities, FSCDiff addresses the insufficient representation and boundary shift issues in USOD by leveraging Fourier-domain information and the multi-step iterative generation capability of diffusion models. The framework consists of two key components: the Dual-Domain Entanglement Enhancement Block (DTEB) and the Stable Time-step Mask Prediction Module (STMP). DTEB uses Fourier-spatial entanglement learning to exploit both the Fourier-domain and spatial-domain information of RGB images and depth maps, thereby improving feature representation. STMP uses the multi-step iterative mechanism of diffusion models to enhance the accuracy and robustness of the segmentation results. Experimental results indicate that FSCDiff outperforms state-of-the-art approaches on the USOD10K and USOD datasets. The source code is available at: https://github.com/lgwplay/FSCDiff.
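To make the phrase "Fourier and spatial domain information of RGB images and depth maps" concrete, the following is a minimal NumPy sketch of how an RGB-D input might be decomposed into Fourier amplitude and phase and stacked alongside its spatial channels. It is not the authors' DTEB implementation, which is not described at this level of detail here; the function names (`fourier_features`, `from_fourier`, `dual_domain_stack`), the log scaling of the amplitude, and the simple channel concatenation are all assumptions made purely for illustration.

```python
import numpy as np


def fourier_features(x):
    """Return amplitude and phase of the 2-D FFT taken over the last two axes."""
    spectrum = np.fft.fft2(x, axes=(-2, -1))
    return np.abs(spectrum), np.angle(spectrum)


def from_fourier(amplitude, phase):
    """Invert fourier_features: rebuild the spatial signal from amplitude and phase."""
    spectrum = amplitude * np.exp(1j * phase)
    return np.fft.ifft2(spectrum, axes=(-2, -1)).real


def dual_domain_stack(rgb, depth):
    """Hypothetical helper: stack spatial and Fourier views of an RGB-D pair.

    rgb has shape (3, H, W) and depth has shape (1, H, W).
    The result has shape (12, H, W): 4 spatial channels,
    4 log-amplitude channels, and 4 phase channels.
    """
    spatial = np.concatenate([rgb, depth], axis=0)
    amplitude, phase = fourier_features(spatial)
    # log1p compresses the large dynamic range of the amplitude (an assumption).
    return np.concatenate([spatial, np.log1p(amplitude), phase], axis=0)


# Random arrays stand in for a real underwater image and its depth map.
rng = np.random.default_rng(0)
rgb = rng.random((3, 32, 32))
depth = rng.random((1, 32, 32))
features = dual_domain_stack(rgb, depth)
```

Amplitude and phase together carry the same information as the spatial image, so `from_fourier` recovers the input up to floating-point error. A learned block would presumably process the two domains with trainable layers rather than simply concatenating them as done in this sketch.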