This paper proposes operads as a rigorous mathematical framework for question decomposition in large language models (LLMs), a strategy that currently lacks a formal foundation. The authors define the questions operad $Q$, in which operations are question templates and composition is substitution of sub-answers, and interpret QA models as algebras over $Q$. This leads to a new metric, operadic consistency, which measures whether a model's answers agree across the partial collapses of a question decomposition tree. Empirical results, reported in a companion paper (Bottman, Liu, and Richardson, 2026), indicate that operadic consistency is strongly correlated with accuracy in multi-hop question answering and outperforms standard temperature-based self-consistency baselines.
Operads give multi-step reasoning in LLMs a formal mathematical framework, and operadic consistency offers a concrete signal of how reliable a model's answers are.
Question decomposition, i.e. breaking a complex query into simpler sub-queries whose answers are composed to produce a final answer, is a widely used strategy for improving LLM reasoning, yet it currently lacks a rigorous mathematical foundation. In this paper, we propose operads, mathematical structures that model many-in, one-out operations and compositions thereof, as a natural framework for describing question decomposition. We define the questions operad $Q$, in which operations correspond to question templates and composition corresponds to substitution of sub-answers, and show how QA models can be interpreted as algebras over $Q$. Beyond reframing existing practice, this operadic perspective points toward new methods, in particular a notion of operadic consistency, which measures whether a QA model's answers agree across the partial collapses of a question decomposition tree. Empirical evaluation of operadic consistency is reported in our companion paper (Bottman, Liu, and Richardson, 2026), which finds it strongly correlated with accuracy across twelve LLMs and four multi-hop QA datasets and outperforming standard temperature-based self-consistency baselines. We argue that operads are the natural mathematical home for question decomposition, and that invariants such as operadic consistency open new directions for analyzing and improving the reliability of multi-step reasoning.
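To make the idea of partial collapses concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes one plausible reading of the abstract: each edge of a decomposition tree can either be kept (the sub-question is answered first and its answer is substituted into the parent template) or collapsed (the sub-question text itself is substituted, which corresponds to composing templates in $Q$). The names `Question`, `render`, and `operadic_consistency`, the agreement score (fraction of collapses that match the most common answer), and the lookup-table "models" are all hypothetical choices made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Question:
    """A node in a decomposition tree (hypothetical structure).

    `template` has slots {0}, {1}, ... that are filled by its children,
    mirroring a many-in, one-out operation.
    """
    name: str
    template: str
    children: Tuple["Question", ...] = ()


def render(q: Question, model: Callable[[str], str], collapsed: FrozenSet[str]) -> str:
    """Build the text of q under a given choice of collapsed edges."""
    fills = []
    for child in q.children:
        text = render(child, model, collapsed)
        # Collapsed edge: splice in the sub-question text (template composition).
        # Kept edge: ask the model first and splice in its sub-answer.
        fills.append(text if child.name in collapsed else model(text))
    return q.template.format(*fills)


def edge_names(q: Question) -> List[str]:
    """List every non-root node; each one stands for the edge to its parent."""
    names = []
    for child in q.children:
        names.append(child.name)
        names.extend(edge_names(child))
    return names


def operadic_consistency(q: Question, model: Callable[[str], str]) -> float:
    """Fraction of partial collapses whose final answer matches the most common one.

    This scoring rule is an assumption; the source does not specify one.
    """
    edges = edge_names(q)
    answers = []
    for r in range(len(edges) + 1):
        for subset in combinations(edges, r):
            answers.append(model(render(q, model, frozenset(subset))))
    return Counter(answers).most_common(1)[0][1] / len(answers)


# Toy two-hop example; lookup tables stand in for a QA model.
tree = Question(
    name="root",
    template="Where was {0} born?",
    children=(Question(name="director", template="the director of Inception"),),
)

consistent_model = {
    "the director of Inception": "Christopher Nolan",
    "Where was Christopher Nolan born?": "London",
    "Where was the director of Inception born?": "London",
}
# Same sub-answers, but the collapsed (single-shot) question is answered differently.
inconsistent_model = dict(consistent_model)
inconsistent_model["Where was the director of Inception born?"] = "Los Angeles"

score_ok = operadic_consistency(tree, consistent_model.__getitem__)
score_bad = operadic_consistency(tree, inconsistent_model.__getitem__)
print(score_ok, score_bad)
```

In this sketch the tree has a single edge, so there are two partial collapses: the step-by-step route and the fully composed question. A model whose answers agree on both scores 1.0, and one that disagrees scores 0.5. Enumerating every subset of edges grows exponentially with tree size, so a practical version would presumably sample collapses rather than list them all.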