Feb 23, 2026 · arXiv:2602.19498

Softmax is not Enough (for Adaptive Conformal Classification)

Navid Akhavan Attar, Hesam Asadollahzadeh, Ling Luo, Uwe Aickelin

AI Summary

The paper addresses a limitation of deep conformal classifiers: their nonconformity scores are built on softmax outputs, which can give unreliable uncertainty estimates and limit how well prediction sets adapt to input difficulty. The authors propose reweighting nonconformity scores with a monotonic transformation of the Helmholtz Free Energy, computed from the pre-softmax logits, so that the scores better reflect model uncertainty and sample difficulty. Experiments on multiple datasets and architectures show that this energy-based enhancement improves both the adaptiveness and the efficiency of prediction sets relative to the baseline nonconformity scores.

Key Contribution

Softmax outputs hamstring conformal prediction's ability to adapt to input difficulty; using Helmholtz Free Energy from the logit space as a nonconformity score re-weighting mechanism unlocks more efficient and adaptive prediction sets.
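One way to see why logits carry information that softmax discards is to compare how each quantity responds to a constant shift of the logits. The block below uses the standard definition of the free energy from the energy-based-model literature; the notation (logits $f_k(x)$, temperature $T$, $K$ classes, shift $c$) is assumed here for illustration and is not taken from the paper.

```latex
% Free energy of an input x with logits f_1(x), ..., f_K(x) at temperature T
F(x; T) = -T \log \sum_{k=1}^{K} \exp\!\left( \frac{f_k(x)}{T} \right)

% Softmax is invariant to adding a constant c to every logit
\frac{\exp\!\left( (f_k(x) + c)/T \right)}{\sum_{j=1}^{K} \exp\!\left( (f_j(x) + c)/T \right)}
  = \frac{\exp\!\left( f_k(x)/T \right)}{\sum_{j=1}^{K} \exp\!\left( f_j(x)/T \right)}

% The free energy is not: the same shift moves it by exactly -c
-T \log \sum_{k=1}^{K} \exp\!\left( \frac{f_k(x) + c}{T} \right) = F(x; T) - c
```

Two inputs whose logits differ only by a uniform offset therefore receive identical softmax probabilities but different energies, so the energy retains the overall logit magnitude that normalization removes.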

Abstract

The merit of Conformal Prediction (CP), as a distribution-free framework for uncertainty quantification, depends on generating prediction sets that are efficient, reflected in small average set sizes, while adaptive, meaning they signal uncertainty by varying in size according to input difficulty. A central limitation for deep conformal classifiers is that the nonconformity scores are derived from softmax outputs, which can be unreliable indicators of how certain the model truly is about a given input, sometimes leading to overconfident misclassifications or undue hesitation. In this work, we argue that this unreliability can be inherited by the prediction sets generated by CP, limiting their capacity for adaptiveness. We propose a new approach that leverages information from the pre-softmax logit space, using the Helmholtz Free Energy as a measure of model uncertainty and sample difficulty. By reweighting nonconformity scores with a monotonic transformation of the energy score of each sample, we improve their sensitivity to input difficulty. Our experiments with four state-of-the-art score functions on multiple datasets and deep architectures show that this energy-based enhancement improves the adaptiveness of the prediction sets, leading to a notable increase in both efficiency and adaptiveness compared to baseline nonconformity scores, without introducing any post-hoc complexity.
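The idea of reweighting a nonconformity score by a function of the energy can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the baseline score (one minus the softmax probability), the monotonic transformation (a softplus of the negative energy), and all function names are hypothetical choices made here, since the abstract does not specify them. The calibration step is ordinary split conformal prediction.

```python
import numpy as np

def free_energy(logits, temperature=1.0):
    """Free energy F(x) = -T * log(sum_k exp(f_k(x) / T)), computed stably."""
    z = np.asarray(logits, dtype=float) / temperature
    m = z.max(axis=-1, keepdims=True)  # subtract the max before exponentiating
    lse = m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1))
    return -temperature * lse

def softmax(logits):
    z = np.asarray(logits, dtype=float)
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def energy_weight(logits, temperature=1.0):
    """Assumed monotonic transform: softplus(-F(x)), positive and increasing in -F(x)."""
    return np.logaddexp(0.0, -free_energy(logits, temperature))

def reweighted_scores(logits, temperature=1.0):
    """Assumed baseline score 1 - p_k(x) per class, scaled by the per-sample energy weight."""
    base = 1.0 - softmax(logits)
    return base * energy_weight(logits, temperature)[..., None]

def calibrate(cal_logits, cal_labels, alpha=0.1):
    """Split conformal calibration: finite-sample-corrected quantile of true-class scores."""
    scores = reweighted_scores(cal_logits)
    true_scores = scores[np.arange(len(cal_labels)), cal_labels]
    n = len(true_scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(true_scores, level, method="higher")

def prediction_sets(test_logits, qhat):
    """Boolean mask of shape (n_samples, n_classes); True marks classes kept in the set."""
    return reweighted_scores(test_logits) <= qhat
```

With this particular choice of weight, inputs with large logits (low energy) have their scores scaled up, so fewer classes fall under the threshold, while inputs with high energy have their scores scaled down and tend to receive larger sets. Other monotonic transformations would trade these effects off differently.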

Eval Frameworks & Benchmarks · Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
