Max Planck | University of Technology Nuremberg | PointVIA Research Center | Apr 13, 2026 | arXiv:2604.11170

Do Instance Priors Help Weakly Supervised Semantic Segmentation?

Anurag Das, Anna Kukleva, Xinting Hu, Yuki M. Asano, Bernt Schiele

AI Summary

This paper introduces SeSAM, a framework that adapts the Segment Anything Model (SAM) for weakly supervised semantic segmentation by addressing challenges in using instance-based SAM for class-based segmentation. SeSAM decomposes class masks, samples point prompts along object skeletons, selects SAM masks using weak-label coverage, and iteratively refines labels using pseudo-labels. Experiments demonstrate that SeSAM outperforms weakly supervised baselines across multiple benchmarks and weak annotation types, significantly reducing annotation costs.

Key Contribution

SAM, designed for instance segmentation, can be surprisingly effective for semantic segmentation with weak supervision when adapted with techniques like skeleton-based prompting and iterative pseudo-label refinement.

Abstract

Semantic segmentation requires dense pixel-level annotations, which are costly and time-consuming to acquire. To address this, we present SeSAM, a framework that uses a foundational segmentation model, the Segment Anything Model (SAM), with weak labels, including coarse masks, scribbles, and points. SAM was originally designed for instance-based segmentation and cannot be used directly for semantic segmentation. In this work, we identify the specific challenges SAM faces and determine the components needed to adapt it for class-based segmentation using weak labels. Specifically, SeSAM decomposes class masks into connected components, samples point prompts along object skeletons, selects SAM masks using weak-label coverage, and iteratively refines labels using pseudo-labels, enabling SAM-generated masks to be used effectively for semantic segmentation. Integrated with a semi-supervised learning framework, SeSAM balances ground-truth labels, SAM-based pseudo-labels, and high-confidence pseudo-labels, significantly improving segmentation quality. Extensive experiments across multiple benchmarks and weak annotation types show that SeSAM consistently outperforms weakly supervised baselines while substantially reducing annotation cost relative to fine supervision.
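The first three steps named in the abstract (connected-component decomposition, point prompts along skeletons, and mask selection by weak-label coverage) can be illustrated with a minimal, dependency-free sketch. Everything here is an assumption made for illustration rather than the authors' implementation: masks are represented as sets of (row, col) pixels, Zhang-Suen thinning stands in for whatever skeletonization SeSAM uses, prompts are taken at evenly spaced positions in sorted pixel order, and the coverage score with a smallest-mask tie-break is a guess at the selection rule. The function names are hypothetical, and no real SAM API is called; candidate masks are assumed to come from a prompted segmenter elsewhere.

```python
from collections import deque
from typing import List, Sequence, Set, Tuple

Pixel = Tuple[int, int]  # (row, col)


def connected_components(mask: Set[Pixel]) -> List[Set[Pixel]]:
    """Split one class mask into its 8-connected components."""
    remaining = set(mask)
    components = []
    while remaining:
        seed = remaining.pop()
        comp, queue = {seed}, deque([seed])
        while queue:
            y, x = queue.popleft()
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    nb = (y + dy, x + dx)
                    if nb in remaining:
                        remaining.remove(nb)
                        comp.add(nb)
                        queue.append(nb)
        components.append(comp)
    return components


def thin(component: Set[Pixel]) -> Set[Pixel]:
    """Zhang-Suen thinning, used here as a stand-in skeletonization."""
    skel = set(component)
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            doomed = []
            for (y, x) in skel:
                # Neighbours clockwise, starting from north.
                ring = [(y - 1, x), (y - 1, x + 1), (y, x + 1), (y + 1, x + 1),
                        (y + 1, x), (y + 1, x - 1), (y, x - 1), (y - 1, x - 1)]
                p = [int(q in skel) for q in ring]
                b = sum(p)  # number of foreground neighbours
                a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                n, e, s, w = p[0], p[2], p[4], p[6]
                if step == 0:
                    ok = n * e * s == 0 and e * s * w == 0
                else:
                    ok = n * e * w == 0 and n * s * w == 0
                if 2 <= b <= 6 and a == 1 and ok:
                    doomed.append((y, x))
            if doomed:
                skel.difference_update(doomed)
                changed = True
    # Thinning can erase very small blobs entirely; fall back to the blob itself.
    return skel or set(component)


def sample_prompts(skeleton: Set[Pixel], k: int = 3) -> List[Pixel]:
    """Pick up to k skeleton pixels, evenly spaced in sorted order."""
    pts = sorted(skeleton)
    if len(pts) <= k:
        return pts
    stride = len(pts) / k
    return [pts[int((i + 0.5) * stride)] for i in range(k)]


def select_by_coverage(candidates: Sequence[Set[Pixel]], weak: Set[Pixel]) -> Set[Pixel]:
    """Choose the candidate covering the most weak-label pixels (smallest mask on ties)."""
    return max(candidates, key=lambda m: (len(m & weak) / len(weak), -len(m)))
```

In use, each component of a coarse class mask would be thinned, a few prompts sampled from the result, those prompts passed to a promptable segmenter, and the returned candidates filtered with `select_by_coverage` against the same weak label. The tie-break toward smaller masks is one way to avoid trivially picking a mask that covers the whole image.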

Computer Vision | Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
