This paper investigates the trade-off between sufficiency and conciseness in LLM self-explanations, framing explanations as compressed representations through the lens of the information bottleneck principle. The authors introduce an evaluation pipeline that constrains explanation length and assesses sufficiency using multiple language models on the ARC Challenge dataset in both English and Persian. The key finding is that explanation length can be reduced substantially without sacrificing accuracy, although excessive compression does degrade performance.
LLMs can often achieve the same accuracy with significantly shorter self-explanations, suggesting that current chain-of-thought reasoning is unnecessarily verbose.
Large Language Models increasingly rely on self-explanations, such as chain-of-thought reasoning, to improve performance on multi-step question answering. While these explanations enhance accuracy, they are often verbose and costly to generate, raising the question of how much explanation is truly necessary. In this paper, we examine the trade-off between sufficiency, defined as the ability of an explanation to justify the correct answer, and conciseness, defined as the reduction in explanation length. Building on the information bottleneck principle, we conceptualize explanations as compressed representations that retain only the information essential for producing correct answers. To operationalize this view, we introduce an evaluation pipeline that constrains explanation length and assesses sufficiency using multiple language models on the ARC Challenge dataset. To broaden the scope, we conduct experiments in both English, using the original dataset, and Persian, a resource-limited language, using a translated version of the dataset. Our experiments show that more concise explanations often remain sufficient, preserving accuracy while substantially reducing explanation length, whereas excessive compression leads to performance degradation.
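The shape of such an evaluation can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say how length is constrained or how the reader models are prompted, so word-level truncation stands in for the length constraint and a toy function stands in for the language model that judges sufficiency. All names here (`truncate_to_budget`, `sufficiency_by_budget`, `toy_reader`, `ITEMS`) are hypothetical, and the two items are made-up examples, not ARC Challenge data.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical item format: (question, full_explanation, gold_answer).
Item = Tuple[str, str, str]


def truncate_to_budget(explanation: str, max_words: int) -> str:
    """Crude stand-in for a length constraint: keep the first max_words words."""
    return " ".join(explanation.split()[:max_words])


def sufficiency_by_budget(
    items: Sequence[Item],
    budgets: Sequence[int],
    reader: Callable[[str, str], str],
) -> Dict[int, Dict[str, float]]:
    """For each word budget, report reader accuracy and relative length reduction."""
    results: Dict[int, Dict[str, float]] = {}
    for budget in budgets:
        correct = 0
        kept_words = 0
        total_words = 0
        for question, explanation, gold in items:
            short = truncate_to_budget(explanation, budget)
            kept_words += len(short.split())
            total_words += len(explanation.split())
            # The explanation counts as sufficient if the reader, given only
            # the question and the shortened explanation, recovers the gold answer.
            if reader(question, short) == gold:
                correct += 1
        results[budget] = {
            "accuracy": correct / len(items),
            "length_reduction": 1 - kept_words / total_words,
        }
    return results


def toy_reader(question: str, explanation: str) -> str:
    """Toy stand-in for an LLM reader: returns the final word if it is an option letter."""
    words = explanation.split()
    if words and words[-1] in {"A", "B", "C", "D"}:
        return words[-1]
    return "?"


# Made-up items for illustration only.
ITEMS = [
    ("Q1", "Plants need light so the answer is B", "B"),
    ("Q2", "Ice melts when heated therefore the answer is C", "C"),
]

results = sufficiency_by_budget(ITEMS, budgets=[100, 4], reader=toy_reader)
for budget, stats in results.items():
    print(budget, stats)
```

In a real setup, `reader` would wrap a call to a language model, and the shortened explanation would more plausibly come from asking the model to explain within a length limit rather than from truncation. Sweeping the budget then traces the trade-off the abstract describes: accuracy stays flat while the explanation still carries the decisive information and drops once compression removes it.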