This paper investigates the relationship between LLM explanations, actions, and underlying understanding. It argues that LLMs can give coherent explanations that misidentify why their actions succeed, and can also give accurate explanations that they fail to act on. Through experiments in compiler optimization and hyperparameter tuning, the author demonstrates a "Bidirectional Coherence Paradox": in low-observability domains LLMs succeed while misidentifying the mechanisms behind their success, and in high-observability domains they fail to intervene effectively despite accurate explanations. The paper introduces the Epistemic Triangle, a model of how priors, signals, and domain knowledge interact under varying observability, and challenges the assumption that coherence implies understanding in LLMs.
LLMs can be confidently wrong about *why* they succeed, and accurately explain failures they can't fix, revealing a fundamental disconnect between explanation and competence.
When an agent can articulate why something works, we typically take this as evidence of genuine understanding. This presupposes that effective action and correct explanation covary, and that coherent explanation reliably signals both. I argue that this assumption fails for contemporary Large Language Models (LLMs). I introduce what I call the Bidirectional Coherence Paradox: competence and grounding not only dissociate but invert across epistemic conditions. In low-observability domains, LLMs often act successfully while misidentifying the mechanisms that produce their success. In high-observability domains, they frequently generate explanations that accurately track observable causal structure yet fail to translate those diagnoses into effective intervention. In both cases, explanatory coherence remains intact, obscuring the underlying dissociation. Drawing on experiments in compiler optimization and hyperparameter tuning, I develop the Epistemic Triangle, a model of how priors, signals, and domain knowledge interact under varying observability. The results suggest that neither behavioral success nor explanatory accuracy alone suffices for attributing understanding. I argue that evaluating artificial epistemic agents requires a tripartite framework -- coherence, grounding, and a proper basing relation linking explanation to action. The systematic separation of knowing-that and knowing-how in LLMs thus challenges assumptions inherited from both epistemology and current AI evaluation practice.