Apr 9, 2026 · arXiv:2604.08454

Less Approximates More: Harmonizing Performance and Confidence Faithfulness via Hybrid Post-Training for High-Stakes Tasks

Haokai Ma, Lee Yan Zhen, Gang Yang, Yunshan Ma, Ee-Chien Chang, Tat-Seng Chua

AI Summary

The paper introduces HyTuning, a hybrid post-training framework that adaptively combines Reasoning Distillation (RD) and Reinforcement Learning from Internal Feedback (RLIF) to improve both accuracy and confidence faithfulness in LLMs for high-stakes tasks. HyTuning uses Progressive Reasoning Gain (PRG) to measure the progressive support for the final answer within reasoning traces, allowing for adaptive reweighting of RD and RLIF. Experiments on domain-specific and general benchmarks show that HyTuning improves accuracy and confidence faithfulness, even with limited supervised reasoning traces.

Key Contribution

LLMs can be made more accurate *and* more trustworthy with a clever post-training method that selectively amplifies only the reasoning steps that progressively build confidence in the correct answer.

Abstract

Large language models are increasingly deployed in high-stakes tasks, where confident yet incorrect inferences may cause severe real-world harm, bringing the previously overlooked issue of confidence faithfulness back to the forefront. A promising solution is to jointly optimize unsupervised Reinforcement Learning from Internal Feedback (RLIF) with reasoning-trace-guided Reasoning Distillation (RD), though this approach may face three persistent challenges: scarcity of high-quality training corpora, factually unwarranted overconfidence, and indiscriminate fusion that amplifies erroneous updates. Inspired by how human confidence accumulates from uncertainty to certainty, we propose Progressive Reasoning Gain (PRG) to measure whether reasoning steps progressively strengthen support for the final answer. Furthermore, we introduce HyTuning, a hybrid post-training framework that adaptively reweights RD and RLIF via a PRG-style metric, using scarce supervised reasoning traces as a stable anchor while exploiting abundant unlabeled queries for scalability. Experiments on several domain-specific and general benchmarks demonstrate that HyTuning improves accuracy while achieving confidence faithfulness under limited supervision, supporting a practical "Less Approximates More" effect.
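
The abstract does not give the PRG formula or the reweighting rule, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that each reasoning step yields a probability for the final answer, scores a trace by the share of step-to-step movement that is upward, and uses that score to blend an RD loss with an RLIF loss. The names `progressive_gain` and `hybrid_loss`, the score definition, and the direction of the weighting are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def progressive_gain(step_confidences: Sequence[float]) -> float:
    """Illustrative PRG-style score in [0, 1] (assumed definition, not the paper's).

    step_confidences[i] is the probability assigned to the final answer
    after reasoning step i. The score is the share of total step-to-step
    movement that is upward, so a steadily rising trace scores 1.0.
    """
    if len(step_confidences) < 2:
        return 0.0
    deltas = [b - a for a, b in zip(step_confidences, step_confidences[1:])]
    total = sum(abs(d) for d in deltas)
    if total == 0.0:
        return 0.0  # flat trace: no evidence of progressive support
    gains = sum(d for d in deltas if d > 0)
    return gains / total


def hybrid_loss(rd_loss: float, rlif_loss: float, gain: float) -> float:
    """Blend the two objectives with a gain-dependent weight.

    Assumption: a high gain shifts weight toward the RLIF term, and a low
    gain falls back on the supervised RD term as the anchor.
    """
    weight = min(max(gain, 0.0), 1.0)  # clamp to [0, 1]
    return (1.0 - weight) * rd_loss + weight * rlif_loss


rising = progressive_gain([0.2, 0.5, 0.9])   # confidence grows at every step
falling = progressive_gain([0.9, 0.5, 0.2])  # confidence erodes at every step
blended = hybrid_loss(rd_loss=2.0, rlif_loss=4.0, gain=0.5)
```

In this sketch a trace whose confidence erodes contributes only the RD term, which mirrors the abstract's description of supervised traces acting as a stable anchor; the actual weighting used by HyTuning may differ.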

Constitutional AI & AI Ethics · Eval Frameworks & Benchmarks · Reasoning & Chain-of-Thought · RLHF & Preference Learning

Citation Metrics

Citations: 0

Influential citations: 0

References: 40

Year: 2026

Venue: N/A
