Feb 25, 2026 | arXiv:2602.21712

Innovative Tooth Segmentation Using Hierarchical Features and Bidirectional Sequence Modeling

Xinxin Zhao, Jian Jiang, Yan Tian, Liqin Wu, Zhaocheng Xu, Teddy Yang, Yunuo Zou, Xun Wang

AI Summary

This paper introduces a novel three-stage encoder with hierarchical feature representation for tooth image segmentation to address the limitations of fixed-resolution feature maps and the computational cost of transformer-based self-attention. The encoder captures scale-adaptive information and fuses cross-scale features to preserve fine structural information and contextual awareness. By incorporating a bidirectional sequence modeling strategy, the model enhances global spatial context understanding, achieving a 1.1% mIoU improvement on the OralVision dataset.
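For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean intersection over union (mIoU) is commonly computed for a segmentation mask. The function name `mean_iou`, the flat label lists, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; the paper's exact evaluation protocol is not described here.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes, for flat lists of integer class labels.

    Assumption: a class that appears in neither pred nor target is
    skipped rather than counted as IoU = 0 or 1.
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    # Guard against the degenerate case of no class being present.
    return sum(ious) / len(ious) if ious else 0.0


# Toy 4-pixel example with two classes (0 = background, 1 = tooth).
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
score = mean_iou(pred, target, num_classes=2)
```

In this toy case the background IoU is 1/2 and the tooth IoU is 2/3, so `score` is their average, 7/12.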

Key Contribution

The encoder avoids the quadratic complexity of transformer self-attention on high-resolution dental images: it uses bidirectional sequence modeling to capture global spatial context at lower computational cost.
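To make the scaling argument concrete, here is a rough back-of-the-envelope sketch comparing how the number of token interactions grows. The 16-pixel patch size, the 1024 x 1024 image, and both function names are illustrative assumptions, not values from the paper, and the counts ignore hidden-dimension constants.

```python
def num_tokens(height, width, patch=16):
    # Number of non-overlapping patches (tokens); patch size is an assumption.
    return (height // patch) * (width // patch)


def attention_pairs(height, width, patch=16):
    # Self-attention compares every token with every token: n^2 pairs.
    n = num_tokens(height, width, patch)
    return n * n


def bidirectional_scan_steps(height, width, patch=16):
    # A forward pass plus a backward pass over the sequence: 2n steps.
    n = num_tokens(height, width, patch)
    return 2 * n


pairs = attention_pairs(1024, 1024)           # 4096 tokens -> 16,777,216 pairs
steps = bidirectional_scan_steps(1024, 1024)  # 4096 tokens -> 8,192 steps
```

Doubling the image side length quadruples the token count, which multiplies the pair count by 16 but the scan step count only by 4.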

Abstract

Tooth image segmentation is a cornerstone of dental digitization. However, traditional image encoders relying on fixed-resolution feature maps often lead to discontinuous segmentation and poor discrimination between target regions and background, due to insufficient modeling of environmental and global context. Moreover, transformer-based self-attention introduces substantial computational overhead because of its quadratic complexity (O(n^2)), making it inefficient for high-resolution dental images. To address these challenges, we introduce a three-stage encoder with hierarchical feature representation to capture scale-adaptive information in dental images. By jointly leveraging low-level details and high-level semantics through cross-scale feature fusion, the model effectively preserves fine structural information while maintaining strong contextual awareness. Furthermore, a bidirectional sequence modeling strategy is incorporated to enhance global spatial context understanding without incurring high computational cost. We validate our method on two dental datasets, with experimental results demonstrating its superiority over existing approaches. On the OralVision dataset, our model achieves a 1.1% improvement in mean intersection over union (mIoU).
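The idea of bidirectional sequence modeling can be illustrated with a minimal sketch. This is not the authors' architecture: it is a generic scalar linear recurrence run once left-to-right and once right-to-left, with a fixed `decay` chosen for illustration. The names `scan` and `bidirectional_scan` are hypothetical.

```python
def scan(xs, decay):
    """Causal linear recurrence h_t = decay * h_{t-1} + x_t, in O(n) time."""
    h = 0.0
    out = []
    for x in xs:
        h = decay * h + x
        out.append(h)
    return out


def bidirectional_scan(xs, decay=0.5):
    """Combine a forward and a backward scan so each position sees both sides.

    The input x_t appears in both scans, so it is subtracted once
    to avoid counting the current position twice.
    """
    fwd = scan(xs, decay)
    bwd = scan(xs[::-1], decay)[::-1]
    return [f + b - x for f, b, x in zip(fwd, bwd, xs)]


# An impulse in the middle of the sequence spreads in both directions.
response = bidirectional_scan([0.0, 1.0, 0.0], decay=0.5)
# response == [0.5, 1.0, 0.5]
```

A purely forward scan would leave the first position at 0.0 because it never sees later tokens; adding the reverse pass gives every position context from the whole sequence while the cost stays linear in its length.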

Architecture Design (Transformers, SSMs, MoE); Computer Vision

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
