Notre Dame | Jun 9, 2026 | arXiv:2606.11512

SAGE: Answer-Conditioned Uncertainty Targets for Verbal Uncertainty Alignment

AI Summary

This paper introduces SAGE (Semantic-Answer Guided Entropy), a group-level uncertainty target for aligning the verbal uncertainty of large language models with their sampled behavior. The target is estimated from repeated model outputs rather than from an isolated response, and it builds an answer-conditioned uncertainty geometry that preserves the distinctions among categorical, numeric, and symbolic answers while keeping the calibration signal smooth and scale-preserving. Applied through Group-Uncertainty Preference Optimization (GUPO), which supervises the verbal uncertainty expressions rather than the full response, the approach improves uncertainty ranking, lowers calibration error, and reduces overconfidence across factual, mathematical, and multiple-choice reasoning tasks.

Key Contribution

Verbal uncertainty in large language models can be aligned with their sampled behavior by training against a group-level, answer-conditioned uncertainty target, which improves calibration and reduces overconfidence.

Abstract

Large language models increasingly express uncertainty through natural-language statements, yet these expressions often fail to reflect the model's sampled behavior. We study verbal uncertainty alignment as a distributional calibration problem: the appropriate uncertainty target for a prompt should be estimated from repeated model outputs rather than from an isolated response. However, group rollouts alone are insufficient, since the resulting target must provide a useful training signal. Existing targets only partially satisfy this requirement. We propose SAGE, Semantic-Answer Guided Entropy, a group-level uncertainty target that constructs an answer-conditioned uncertainty geometry over sampled responses. SAGE preserves categorical, numeric, and symbolic answer distinctions while maintaining a smooth and scale-preserving calibration signal. We further apply this target through Group-Uncertainty Preference Optimization, or GUPO, an uncertainty-channel training framework that supervises verbal uncertainty expressions rather than the full response. Experiments across factual, mathematical, and multiple-choice reasoning tasks show improved uncertainty ranking, lower calibration error, and reduced overconfidence.
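The idea of estimating an uncertainty target from repeated outputs, rather than from a single response, can be illustrated with a minimal sketch. This is not the SAGE definition, which this page does not give; it is a plain normalized-entropy baseline over exact-match answer groups. The function names `normalize_answer` and `group_uncertainty` and the canonicalization rule are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def normalize_answer(ans: str) -> str:
    """Hypothetical canonicalizer (an assumption, not from the paper):
    numeric answers are compared by value, everything else by
    case-folded, whitespace-trimmed text."""
    text = ans.strip().lower()
    try:
        return repr(float(text))
    except ValueError:
        return text


def group_uncertainty(answers: list[str]) -> float:
    """Normalized Shannon entropy of the answers in one rollout group.

    Returns a value in [0, 1]: 0 when every sample agrees, 1 when
    every sample gives a different answer.
    """
    n = len(answers)
    counts = Counter(normalize_answer(a) for a in answers)
    if n < 2 or len(counts) == 1:
        return 0.0
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    # Entropy is maximal, log(n), when all n samples are distinct.
    return entropy / math.log(n)


if __name__ == "__main__":
    # Eight hypothetical samples for one prompt.
    samples = ["42", "42.0", "42", "41", "42", "42", "40", "42"]
    print(round(group_uncertainty(samples), 3))
```

A target of this kind treats all disagreements as equally distant, so "41" versus "42" counts the same as "41" versus "1000". The abstract's emphasis on preserving categorical, numeric, and symbolic distinctions suggests SAGE is designed to address exactly that limitation.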

Eval Frameworks & Benchmarks | Natural Language Processing | Scalable Oversight & Alignment Theory

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
