This study critiques existing explainable AI (XAI) methods, which primarily produce static lists of feature importances, and argues that explanations should instead take the form of narratives to support human understanding. Drawing on insights from the social sciences and linguistics, the authors identify four key properties of effective narrative explanations: continuous structure, cause-effect mechanisms, linguistic fluency, and lexical diversity. They introduce seven automatic metrics to evaluate narrative quality and show that these metrics separate descriptive from narrative explanations more reliably than standard NLP metrics across six datasets.
Narrative-based explanations in XAI could improve human comprehension of model predictions by conveying why a prediction occurs, which static feature-importance lists do not.
Explainable AI (XAI) aims to make the behaviour of machine learning models interpretable, yet many explanation methods remain difficult to understand. The integration of Natural Language Generation into XAI aims to deliver explanations in textual form, making them more accessible to practitioners. Current approaches, however, largely yield static lists of feature importances. Although such explanations indicate what influences the prediction, they do not explain why the prediction occurs. In this study, we draw on insights from social sciences and linguistics, and argue that XAI explanations should be presented in the form of narratives. Narrative explanations support human understanding through four defining properties: continuous structure, cause-effect mechanisms, linguistic fluency, and lexical diversity. We show that standard Natural Language Processing (NLP) metrics based solely on token probability or word frequency fail to capture these properties and can be matched or exceeded by tautological text that conveys no explanatory content. To address this issue, we propose seven automatic metrics that quantify the narrative quality of explanations along the four identified dimensions. We benchmark current state-of-the-art explanation generation methods on six datasets and show that the proposed metrics separate descriptive from narrative explanations more reliably than standard NLP metrics. Finally, to further advance the field, we propose a set of problem-agnostic XAI Narrative generation rules for producing natural language XAI explanations, so that the resulting XAI Narratives exhibit stronger narrative properties and align with the findings from the linguistic and social science literature.
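The claim that word-frequency metrics can be matched or exceeded by tautological text can be illustrated with a small sketch. The snippet below uses the type-token ratio, a common word-frequency measure of lexical diversity, chosen here only for illustration. It is not one of the seven metrics proposed in the paper, and the function name and both example sentences are hypothetical.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return the share of distinct words in the text (illustrative metric only)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical tautological text: every word is distinct, yet nothing is explained.
tautology = "Predictions arise since outputs emerge whenever models produce results."

# Hypothetical narrative explanation with an explicit cause-effect chain.
narrative = (
    "The customer is likely to churn because the monthly fee rose, "
    "which led the customer to cut usage."
)

print("tautology:", type_token_ratio(tautology))
print("narrative:", type_token_ratio(narrative))
```

In this sketch the tautological sentence scores at least as high as the narrative one, because a purely frequency-based score rewards distinct words without checking whether any cause-effect content is present.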