This paper addresses the challenge of Implicit Discourse Relation Recognition (IDRR) by distilling the reasoning capabilities of large language models (LLMs) into smaller IDRR models. The authors prompt an LLM to generate natural language explanations for IDRR training instances, conditioned on their gold labels, and then train a classification-generation framework that uses the LLM-generated explanations as additional supervision. Experiments on the Penn Discourse Treebank (PDTB) show significant improvements in IDRR performance, and human evaluation indicates that the generated explanations enhance model interpretability. The approach is further validated on sentiment classification and natural language inference (NLI) tasks.
Unlock better discourse understanding: distilling LLM reasoning into smaller models boosts IDRR performance and interpretability using generated explanations.
Implicit Discourse Relation Recognition (IDRR) remains a challenging task due to the requirement for deep semantic understanding in the absence of explicit discourse markers. A further limitation is that existing methods only predict relations without providing any supporting explanations. Recent advances in large language models (LLMs) have shown strong reasoning capabilities in both deep language understanding and natural language explanation generation. In this work, we propose a simple yet effective approach to distill the reasoning capabilities of LLMs into lightweight IDRR models to improve both performance and interpretability. Specifically, we first prompt an LLM to generate explanations for each training instance conditioned on its gold label. Then, we introduce a novel classification-generation framework that jointly performs relation prediction and explanation generation, and train it with the additional supervision of LLM-generated explanations. Our framework is plug-and-play, enabling easy integration with most existing IDRR models. Experimental results on PDTB demonstrate that our approach significantly improves IDRR performance, while human evaluation further confirms that the generated explanations enhance model interpretability. Furthermore, we validate the generality of our approach on sentiment classification and natural language inference.
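The two steps described in the abstract (label-conditioned explanation prompting, then joint training on relation prediction and explanation generation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the prompt wording, the function names, the example arguments, and the weighted-sum objective with a `gen_weight` parameter are hypothetical, since the abstract does not give the paper's actual prompt or loss formulation. The sketch uses only the Python standard library and operates on raw logits rather than a real model.

```python
import math

# Hypothetical prompt template: the paper's actual prompt wording is not given
# in the abstract, so this only illustrates conditioning on the gold label.
PROMPT_TEMPLATE = (
    "Argument 1: {arg1}\n"
    "Argument 2: {arg2}\n"
    "The implicit discourse relation between the arguments is: {label}.\n"
    "Explain briefly why this relation holds."
)


def build_explanation_prompt(arg1, arg2, gold_label):
    """Build an LLM prompt for one training instance, conditioned on its gold label."""
    return PROMPT_TEMPLATE.format(
        arg1=arg1.strip(), arg2=arg2.strip(), label=gold_label
    )


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classification_loss(logits, gold_index):
    """Cross-entropy of the gold relation class under the predicted distribution."""
    return -math.log(softmax(logits)[gold_index])


def generation_loss(token_logits, target_ids):
    """Mean per-token negative log-likelihood of the reference explanation."""
    losses = [
        classification_loss(step_logits, target)
        for step_logits, target in zip(token_logits, target_ids)
    ]
    return sum(losses) / len(losses)


def joint_loss(cls_logits, gold_index, token_logits, target_ids, gen_weight=1.0):
    """Assumed joint objective: classification loss plus weighted generation loss."""
    return classification_loss(cls_logits, gold_index) + gen_weight * generation_loss(
        token_logits, target_ids
    )


if __name__ == "__main__":
    # Illustrative instance (not taken from PDTB) with a top-level PDTB sense label.
    prompt = build_explanation_prompt(
        "The market fell sharply.",
        "Investors were worried about inflation.",
        "Contingency",
    )
    print(prompt)

    # Toy logits: 4 relation classes, and a 2-token explanation over a 2-word vocabulary.
    loss = joint_loss([0.0, 0.0, 0.0, 0.0], 2, [[0.0, 0.0], [0.0, 0.0]], [0, 1], gen_weight=0.5)
    print(round(loss, 4))
```

In this sketch, setting `gen_weight=0.0` reduces the objective to plain relation classification, which mirrors the plug-and-play idea of adding explanation supervision on top of an existing classifier. In a real system the logits would come from the IDRR model's classification head and its explanation decoder.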