The paper introduces Fine-Refine, a novel framework to mitigate hallucination in dialogue systems by refining LLM responses at a fine-grained, atomic unit level. Fine-Refine decomposes responses, verifies each unit against external knowledge, evaluates fluency using perplexity, and iteratively corrects errors. Experiments on HybriDialogue and OpendialKG datasets demonstrate that Fine-Refine significantly improves factual accuracy (up to 7.63 points in fact score) with a minor impact on dialogue quality.
Iteratively refining responses at the atomic unit level, with each unit verified against external knowledge, substantially improves the factual accuracy of dialogue LLMs.
The tendency of current large language models (LLMs) to hallucinate degrades dialogue systems: hallucinated responses are factually incorrect, may mislead users, and undermine trust in the system. Existing refinement methods for dialogue systems typically operate at the response level, overlooking the fact that a single response may contain multiple facts, some verifiable and some not. To address this gap, we propose Fine-Refine, a fine-grained refinement framework that decomposes responses into atomic units, verifies each unit using external knowledge, assesses fluency via perplexity, and iteratively corrects granular errors. We evaluate factuality on the HybriDialogue and OpendialKG datasets in terms of factual accuracy (fact score) and coverage (Not Enough Information Proportion). Experiments show that Fine-Refine substantially improves factuality, achieving up to a 7.63-point gain in dialogue fact score, with a small trade-off in dialogue quality.
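The decompose, verify, assess, and correct loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`decompose`, `refine`), the sentence-level splitting, the three-valued verifier (supported, refuted, or not enough information), the perplexity threshold `max_ppl`, and the toy knowledge base are all assumptions made for illustration. In a real system the verifier, corrector, and fluency scorer would be backed by retrieval and language models. Only the perplexity formula, the exponential of the negative mean token log-probability, is standard.

```python
import math
import re
from typing import Callable, List, Optional


def decompose(response: str) -> List[str]:
    """Toy decomposition (assumption): treat each sentence as one atomic unit."""
    parts = re.split(r"(?<=[.!?])\s+", response.strip())
    return [p for p in parts if p]


def perplexity(token_logprobs: List[float]) -> float:
    """Standard perplexity: exp of the negative mean token log-probability."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def refine(
    response: str,
    verify: Callable[[str], Optional[bool]],  # True / False / None (not enough info)
    correct: Callable[[str], str],            # hypothetical corrector
    ppl: Callable[[str], float],              # hypothetical fluency scorer
    max_ppl: float = 50.0,                    # assumed fluency threshold
    max_rounds: int = 3,
) -> str:
    """Iteratively rewrite refuted units, keeping only fluent corrections."""
    units = decompose(response)
    for _ in range(max_rounds):
        changed = False
        for i, unit in enumerate(units):
            if verify(unit) is False:  # only refuted units are corrected
                fixed = correct(unit)
                if fixed != unit and ppl(fixed) <= max_ppl:
                    units[i] = fixed
                    changed = True
        if not changed:  # stop once a full pass makes no edits
            break
    return " ".join(units)


# Toy stand-ins for external knowledge, a corrector, and a language model.
KB = {
    "Paris is the capital of France.": True,
    "Paris is the capital of Italy.": False,
}
FIXES = {"Paris is the capital of Italy.": "Paris is the capital of France."}


def toy_verify(unit: str) -> Optional[bool]:
    return KB.get(unit)  # None when the knowledge base has no entry


def toy_correct(unit: str) -> str:
    return FIXES.get(unit, unit)


def toy_ppl(text: str) -> float:
    return perplexity([-0.5] * len(text.split()))  # constant per-token log-prob


result = refine(
    "Paris is the capital of Italy. It is lovely in spring.",
    toy_verify,
    toy_correct,
    toy_ppl,
)
```

In this toy run the refuted first unit is replaced, while the second unit, for which the knowledge base has no entry, is left unchanged. Units of that second kind are what a coverage measure such as the Not Enough Information Proportion would count.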