Mar 10, 2026 · arXiv:2603.09530

DCAU-Net: Differential Cross Attention and Channel-Spatial Feature Fusion for Medical Image Segmentation

AI Summary

The paper introduces DCAU-Net, a medical image segmentation framework that targets limitations of both transformer-based and CNN-based approaches. It employs a Differential Cross Attention (DCA) mechanism that computes the difference between two independent softmax attention maps to highlight discriminative structures, and it uses window-level summary tokens in place of pixel-wise key and value tokens to reduce computational complexity. Additionally, a Channel-Spatial Feature Fusion (CSFF) strategy adaptively recalibrates features from skip connections and up-sampling paths using sequential channel and spatial attention.

Key Contribution

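DCAU-Net computes the *difference* between two softmax attention maps to highlight discriminative structures, and it replaces pixel-wise key and value tokens with window-level summary tokens to reduce computational cost relative to standard self-attention. The abstract reports competitive performance on two public benchmarks.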
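The idea of differencing two attention maps over window-level tokens can be sketched in NumPy as follows. This is a minimal illustration, not the paper's implementation: the average-pooling summary, the projection matrices `Wq1`, `Wq2`, `Wk1`, `Wk2`, `Wv`, and the fixed scalar `lam` are all assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def window_summary(feat, window):
    # Assumption: each non-overlapping window is summarized by average pooling.
    H, W, C = feat.shape
    assert H % window == 0 and W % window == 0
    blocks = feat.reshape(H // window, window, W // window, window, C)
    return blocks.mean(axis=(1, 3)).reshape(-1, C)  # (M, C), M = H*W / window**2

def differential_cross_attention(feat, window, Wq1, Wq2, Wk1, Wk2, Wv, lam=0.5):
    # Hypothetical helper: queries come from pixels, keys/values from window summaries.
    H, W, C = feat.shape
    pixels = feat.reshape(-1, C)             # N = H*W query tokens
    summary = window_summary(feat, window)   # M key/value tokens
    d = Wq1.shape[1]
    a1 = softmax((pixels @ Wq1) @ (summary @ Wk1).T / np.sqrt(d))
    a2 = softmax((pixels @ Wq2) @ (summary @ Wk2).T / np.sqrt(d))
    diff = a1 - lam * a2                     # (N, M) differential attention map
    out = diff @ (summary @ Wv)              # (N, d_v)
    return out.reshape(H, W, -1), diff

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8, 16))
Ws = [rng.standard_normal((16, 16)) * 0.1 for _ in range(5)]
out, diff = differential_cross_attention(feat, 4, *Ws)
print(out.shape, diff.shape)
```

In this sketch the attention map has shape N × M rather than N × N, which is where the cost saving over pixel-wise self-attention comes from; subtracting the second map lets weight that both maps assign to the same region cancel out.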

Abstract

Accurate medical image segmentation requires effective modeling of both long-range dependencies and fine-grained boundary details. While transformers mitigate the issue of insufficient semantic information arising from the limited receptive field inherent in convolutional neural networks, they introduce new challenges: standard self-attention incurs quadratic computational complexity and often assigns non-negligible attention weights to irrelevant regions, diluting focus on discriminative structures and ultimately compromising segmentation accuracy. Existing attention variants, although effective in reducing computational complexity, fail to suppress redundant computation and inadvertently impair global context modeling. Furthermore, conventional fusion strategies in encoder-decoder architectures, typically based on simple concatenation or summation, cannot adaptively integrate high-level semantic information with low-level spatial details. To address these limitations, we propose DCAU-Net, a novel yet efficient segmentation framework with two key ideas. First, a new Differential Cross Attention (DCA) is designed to compute the difference between two independent softmax attention maps to adaptively highlight discriminative structures. By replacing pixel-wise key and value tokens with window-level summary tokens, DCA dramatically reduces computational complexity without sacrificing precision. Second, a Channel-Spatial Feature Fusion (CSFF) strategy is introduced to adaptively recalibrate features from skip connections and up-sampling paths using sequential channel and spatial attention, effectively suppressing redundant information and amplifying salient cues. Experiments on two public benchmarks demonstrate that DCAU-Net achieves competitive performance with enhanced segmentation accuracy and robustness.
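A sequential channel-then-spatial recalibration of fused skip and up-sampled features might look like the NumPy sketch below. The abstract does not specify the CSFF layers, so everything here is assumed for illustration: channel-wise concatenation of the two inputs, a two-layer bottleneck for the channel gate (`W1`, `W2`), and a spatial gate built from per-pixel mean and max statistics with a two-element weight `w_spatial` standing in for a convolution.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_spatial_fusion(skip, up, W1, W2, w_spatial):
    # Hypothetical helper; inputs are (H, W, C) maps at the same resolution.
    x = np.concatenate([skip, up], axis=-1)      # (H, W, 2C)

    # Step 1: channel attention from globally pooled statistics.
    pooled = x.mean(axis=(0, 1))                 # (2C,)
    hidden = np.maximum(pooled @ W1, 0.0)        # ReLU bottleneck
    ch_gate = sigmoid(hidden @ W2)               # (2C,) values in (0, 1)
    x = x * ch_gate

    # Step 2: spatial attention from per-pixel mean and max over channels.
    stats = np.stack([x.mean(axis=-1), x.max(axis=-1)], axis=-1)  # (H, W, 2)
    sp_gate = sigmoid(stats @ w_spatial)         # (H, W) values in (0, 1)
    return x * sp_gate[..., None], ch_gate, sp_gate

rng = np.random.default_rng(0)
skip = rng.standard_normal((8, 8, 4))
up = rng.standard_normal((8, 8, 4))
W1 = rng.standard_normal((8, 2)) * 0.1
W2 = rng.standard_normal((2, 8)) * 0.1
w_spatial = rng.standard_normal(2) * 0.1
fused, ch_gate, sp_gate = channel_spatial_fusion(skip, up, W1, W2, w_spatial)
print(fused.shape, ch_gate.shape, sp_gate.shape)
```

Because both gates lie strictly between 0 and 1, the fusion can only attenuate each feature, which matches the stated goal of suppressing redundant information while leaving salient responses comparatively strong.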

Architecture Design (Transformers, SSMs, MoE) · Computer Vision

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
