This paper introduces an O-shaped architecture, O-Transformer-Mamba, for remote sensing image dehazing, addressing limitations of Transformers and Mamba models in handling uneven haze distribution and fine-grained spatial awareness, respectively. The architecture combines a Sparse-Enhanced Self-Attention (SE-SA) module with a Mixed Visual State Space Model (Mix-VSSM) to balance haze-sensitive details with long-range context modeling. Experiments demonstrate that the proposed framework outperforms existing dehazing methods on benchmark datasets.
By adaptively focusing on haze-affected regions and preserving spatial details, O-Transformer-Mamba significantly improves remote sensing image dehazing compared to existing Transformer or Mamba-based approaches.
Although Transformer-based and state-space models (e.g., Mamba) have demonstrated impressive performance in image restoration, they remain limited in remote sensing image dehazing. Transformer-based models tend to distribute attention evenly, making it difficult for them to handle the uneven distribution of haze. Mamba, while it excels at modeling long-range dependencies, lacks fine-grained spatial awareness of complex atmospheric scattering. To overcome these limitations, we present a new O-shaped dehazing architecture that combines a Sparse-Enhanced Self-Attention (SE-SA) module with a Mixed Visual State Space Model (Mix-VSSM), balancing haze-sensitive detail preservation with long-range context modeling in remote sensing images. The SE-SA module introduces a dynamic soft-masking mechanism that adaptively adjusts attention weights according to the local haze distribution, enabling the network to focus more effectively on severely degraded regions while suppressing redundant responses. The Mix-VSSM enhances global context modeling by combining sequential 2D scanning with local residual information, mitigating the loss of spatial detail in the standard VSSM and improving feature representations of haze-degraded remote sensing images. Extensive experiments verify that our O-shaped framework outperforms existing methods on several benchmark datasets.
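The abstract does not give the SE-SA formulation, but the core idea of a dynamic soft mask modulating attention weights by local haze intensity can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function name `sparse_enhanced_attention`, the per-token `haze_score` proxy, and the sigmoid gate with parameters `tau` and `alpha` are all hypothetical, not the paper's actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sparse_enhanced_attention(q, k, v, haze_score, tau=0.5, alpha=4.0):
    """Toy sketch of soft-masked self-attention (hypothetical, not the paper's SE-SA).

    q, k, v:     (n, d) token features
    haze_score:  (n,) estimated haze intensity per token, in [0, 1]

    A sigmoid soft mask biases attention logits toward heavily hazed
    tokens and suppresses weights on lightly hazed (redundant) ones.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                 # (n, n) raw attention logits
    soft_mask = 1.0 / (1.0 + np.exp(-alpha * (haze_score - tau)))  # (n,) gate
    scores = scores + np.log(soft_mask + 1e-8)    # additive bias toward hazy keys
    weights = softmax(scores, axis=-1)            # rows still sum to 1
    return weights @ v, weights

# Usage with random features standing in for image tokens.
rng = np.random.default_rng(0)
n, d = 6, 8
q = rng.standard_normal((n, d))
k = rng.standard_normal((n, d))
v = rng.standard_normal((n, d))
haze = rng.uniform(0.0, 1.0, n)
out, weights = sparse_enhanced_attention(q, k, v, haze)
print(out.shape)  # (6, 8)
```

Biasing the logits (rather than zeroing weights after the softmax) keeps each attention row normalized while still down-weighting low-haze keys, which is one plausible way to realize "adaptively adjusting attention weights based on the local haze distribution."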