This paper surveys uncertainty quantification (UQ) methods for detecting hallucinations in large language models (LLMs), focusing on how traditional UQ concepts such as epistemic and aleatoric uncertainty are adapted to the LLM context. It categorizes existing UQ-based hallucination detection methods along multiple dimensions and provides empirical results for representative approaches. The authors also identify limitations and suggest future research directions for improving the reliability and trustworthiness of LLMs.
Quantifying uncertainty in LLMs offers a promising path to detecting and mitigating hallucinations, but current methods still face significant limitations.
The rapid advancement of large language models (LLMs) has transformed the landscape of natural language processing, enabling breakthroughs across a wide range of areas including question answering, machine translation, and text summarization. Yet their deployment in real-world applications has raised concerns over reliability and trustworthiness, as LLMs remain prone to hallucinations, producing plausible but factually incorrect outputs. Uncertainty quantification (UQ) has emerged as a central research direction for addressing this issue, offering principled measures for assessing the trustworthiness of model generations. We begin by introducing the foundations of UQ, from its formal definition to the traditional distinction between epistemic and aleatoric uncertainty, and then highlight how these concepts have been adapted to the context of LLMs. Building on this, we examine the role of UQ in hallucination detection, where quantifying uncertainty provides a mechanism for identifying unreliable generations and improving reliability. We systematically categorize a wide spectrum of existing methods along multiple dimensions and present empirical results for several representative approaches. Finally, we discuss current limitations and outline promising future research directions, providing a clearer picture of the current landscape of LLM UQ for hallucination detection.
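To make the epistemic/aleatoric distinction concrete, the standard Bayesian formulation (a textbook decomposition; the notation here is illustrative and not necessarily the paper's own) splits total predictive uncertainty into an aleatoric term, the entropy expected even with known parameters, and an epistemic term, the mutual information between the prediction and the model parameters:

```latex
% Total predictive entropy = aleatoric (expected) entropy + epistemic mutual information,
% for input x, output y, parameters theta, and training data D.
\[
\underbrace{\mathcal{H}\!\left[\,p(y \mid x, \mathcal{D})\,\right]}_{\text{total}}
=
\underbrace{\mathbb{E}_{\theta \sim p(\theta \mid \mathcal{D})}\!\left[\mathcal{H}\!\left[\,p(y \mid x, \theta)\,\right]\right]}_{\text{aleatoric}}
+
\underbrace{\mathcal{I}\!\left[\,y;\,\theta \mid x, \mathcal{D}\,\right]}_{\text{epistemic}}
\]
```

In the LLM setting, the simplest sampling-based detectors in this family estimate predictive entropy by drawing several generations for the same prompt and measuring how much the answers disagree. The sketch below shows one such baseline under a strong simplifying assumption (answers are compared by exact string match; semantic-clustering variants relax this); the example answers are hypothetical:

```python
import math
from collections import Counter

def predictive_entropy(samples: list[str]) -> float:
    """Monte Carlo estimate of predictive entropy over sampled generations.

    High entropy (the model's sampled answers disagree) is a common signal
    of a potential hallucination; low entropy suggests a stable answer.
    """
    counts = Counter(samples)  # treat exact-match strings as one outcome
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# Sample the same prompt several times at temperature > 0 with any LLM API,
# then score the disagreement (these answers are made up for illustration):
answers = ["Paris", "Paris", "Paris", "Lyon", "Paris"]
print(predictive_entropy(answers))  # ~0.50 nats; 0.0 would mean full agreement
```

A typical use is to threshold this score: generations whose entropy exceeds a calibration-set cutoff are flagged as potentially hallucinated or routed for abstention.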