The paper introduces HyM3S, a novel hyperspectral image (HSI) classification model that integrates multiscale spatial-spectral convolutions with the Mamba state-space model to capture fine-grained spectral-spatial information while maintaining linear computational complexity. HyM3S extracts multiscale spatial and spectral features, reinforces salient channels with attention, and fuses features adaptively before feeding them into the Mamba module for long-range dependency modeling. Experiments on four benchmark HSI datasets (PaviaU, Houston2013, WHU-Hi-HanChuan, and WHU-Hi-HongHu) demonstrate that HyM3S achieves superior classification accuracy and noise suppression compared to existing methods.
By fusing multiscale spatial-spectral convolutions with Mamba, HyM3S achieves state-of-the-art hyperspectral image classification at linear computational complexity, overcoming the quadratic cost of traditional Transformer self-attention.
Transformer models have been widely adopted for hyperspectral image (HSI) classification due to their exceptional long-sequence modeling capabilities. However, the self-attention mechanism in Transformers incurs quadratic computational complexity, posing challenges in both speed and memory consumption. Recently, a novel state-space model, Mamba, has emerged that overcomes the quadratic complexity of self-attention, achieving linear computational complexity while retaining powerful long-sequence modeling. Yet the original Mamba design does not account for the unique spectral-spatial characteristics of HSI data, making it difficult to capture multiscale features. This limitation can cause the loss of critical spectral-spatial cues at fine targets and complex boundaries, resulting in increased classification noise, blurred boundary segmentation, and reduced overall accuracy. To address the loss of fine-grained spectral-spatial information in HSI, we propose HyM3S: a hyperspectral multiscale spatial-spectral sequence model that integrates multiscale spatial-spectral convolutions with Mamba's linear sequence modeling. HyM3S first extracts multiscale spatial and spectral features in parallel along horizontal and vertical branches, then reinforces salient channels via channel-wise attention. The features are adaptively fused across modality and directional dimensions to form a unified joint representation. Finally, this representation is fed into the Mamba module for long-range dependency modeling under linear complexity, significantly improving classification accuracy and suppressing noise. Experiments on four benchmark HSI datasets (Pavia University, Houston2013, WHU-Hi-HanChuan, and WHU-Hi-HongHu) demonstrate the clear superiority of the proposed HyM3S model for HSI classification.
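To make the complexity argument concrete, the following is a minimal, hypothetical sketch of the scalar state-space recurrence underlying Mamba-style sequence models. It is not the HyM3S or Mamba implementation: the real Mamba uses input-dependent (selective) parameters, many channels, and a hardware-aware parallel scan, while this toy version only shows why one recurrent pass gives O(L) time and O(1) extra state, versus the O(L^2) pairwise interactions of self-attention.

```python
def ssm_scan(x, a=0.9, b=1.0, c=1.0):
    """Run the linear recurrence h_t = a*h_{t-1} + b*x_t with readout y_t = c*h_t.

    A single pass over the sequence: linear time in the sequence length,
    constant extra memory, yet each output still depends on the entire past
    through the hidden state h.
    """
    h = 0.0
    ys = []
    for xt in x:
        h = a * h + b * xt   # state update carries long-range context forward
        ys.append(c * h)     # per-step readout from the compressed state
    return ys


# A unit impulse decays geometrically through the state, so the model
# "remembers" earlier inputs with exponentially fading weight.
out = ssm_scan([1.0, 0.0, 0.0, 0.0])
```

The contrast with self-attention is that no pairwise score between positions is ever materialized; all past context is summarized in the single running state `h`.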