Mar 19, 2026 · arXiv:2603.18792

Rethinking Uncertainty Quantification and Entanglement in Image Segmentation

Jakob Lønborg Christensen, Vedrana Andersen Dahl, Morten Rieger Hannemose, Anders Bjorholm Dahl, Christian F. Baumgartner

AI Summary

This paper investigates how aleatoric (AU) and epistemic (EU) uncertainty interact and become entangled in image segmentation, where uncertainty quantification is crucial for safety-critical applications. The authors empirically evaluate a range of AU-EU model combinations, propose a metric to quantify uncertainty entanglement, and assess the combinations on downstream uncertainty quantification tasks such as out-of-distribution (OOD) detection and calibration. The results show that ensembles exhibit lower entanglement and superior OOD detection, while the best models for ambiguity modeling and calibration are dataset-dependent; a softmax ensemble performs well across all tasks.

Key Contribution

Decomposing uncertainty into aleatoric and epistemic components in image segmentation is often misleading due to substantial entanglement, but ensembles offer a surprisingly robust and less entangled alternative.

Abstract

Uncertainty quantification (UQ) is crucial in safety-critical applications such as medical image segmentation. Total uncertainty is typically decomposed into data-related aleatoric uncertainty (AU) and model-related epistemic uncertainty (EU). Many methods exist for modeling AU (such as Probabilistic UNet, Diffusion) and EU (such as ensembles, MC Dropout), but it is unclear how they interact when combined. Additionally, recent work has revealed substantial entanglement between AU and EU, undermining the interpretability and practical usefulness of the decomposition. We present a comprehensive empirical study covering a broad range of AU-EU model combinations, propose a metric to quantify uncertainty entanglement, and evaluate both across downstream UQ tasks. For out-of-distribution detection, ensembles exhibit consistently lower entanglement and superior performance. For ambiguity modeling and calibration, the best models are dataset-dependent, with softmax/SSN-based methods performing well and Probabilistic UNets being less entangled. A softmax ensemble fares remarkably well on all tasks. Finally, we analyze potential sources of uncertainty entanglement and outline directions for mitigating this effect.
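
As an illustration of the AU/EU decomposition mentioned in the abstract, the sketch below uses the common information-theoretic split for an ensemble: total uncertainty is the entropy of the averaged prediction, AU is the average entropy of the individual members, and EU is their difference (the mutual information). This is an assumption for illustration only; the abstract does not state which decomposition the paper uses, and the proposed entanglement metric is not reproduced here. The function names `entropy` and `decompose_pixel` are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of one categorical distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def decompose_pixel(member_probs):
    """Split the predictive uncertainty of a single pixel.

    member_probs: list of per-member class-probability lists, e.g. the
    softmax outputs of M ensemble members for that pixel (assumed input).
    Returns (total, aleatoric, epistemic) in nats.
    """
    m = len(member_probs)
    n_classes = len(member_probs[0])
    mean_probs = [sum(p[c] for p in member_probs) / m for c in range(n_classes)]
    total = entropy(mean_probs)                            # entropy of the mean prediction
    aleatoric = sum(entropy(p) for p in member_probs) / m  # mean of member entropies
    epistemic = total - aleatoric                          # mutual information
    return total, aleatoric, epistemic

# Members agree that the pixel is ambiguous: the uncertainty is aleatoric.
agree = [[0.5, 0.5], [0.5, 0.5]]
# Members are individually confident but disagree: mostly epistemic.
disagree = [[0.99, 0.01], [0.01, 0.99]]

for name, probs in (("agree", agree), ("disagree", disagree)):
    total, au, eu = decompose_pixel(probs)
    print(f"{name}: total={total:.3f} AU={au:.3f} EU={eu:.3f}")
```

Both toy pixels have the same total uncertainty, yet the split between AU and EU differs. When the two components cannot be separated this cleanly in practice, that is the kind of entanglement the abstract refers to.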

Computer Vision · Scientific Discovery & Drug Design

Citation Metrics

Citations: 0

Influential citations: 0

References: 34

Year: 2026

Venue: N/A
