AI2 | Jun 11, 2026 | arXiv:2606.13634

Operads for compositional reasoning in LLMs

Nathaniel Bottman, Kyle Richardson

AI Summary

This paper proposes operads as a rigorous mathematical framework for question decomposition in large language models (LLMs), a strategy that is widely used but lacks a formal foundation. The authors define the questions operad $Q$, in which operations are question templates and composition is substitution of sub-answers, and interpret QA models as algebras over $Q$. This leads to a new notion, operadic consistency, which measures whether a model's answers agree across the partial collapses of a question decomposition tree. The empirical evaluation is reported in a companion paper (Bottman, Liu, and Richardson, 2026), which finds operadic consistency strongly correlated with accuracy across twelve LLMs and four multi-hop QA datasets and outperforming standard temperature-based self-consistency baselines.

Key Contribution

Operads give question decomposition a rigorous mathematical framework, and the resulting notion of operadic consistency provides a way to check whether an LLM's answers agree across different collapses of the same decomposition tree.

Abstract

Question decomposition, i.e. breaking a complex query into simpler sub-queries whose answers are composed to produce a final answer, is a widely used strategy for improving LLM reasoning, yet it currently lacks a rigorous mathematical foundation. In this paper, we propose operads, mathematical structures that model many-in, one-out operations and compositions thereof, as a natural framework for describing question decomposition. We define the questions operad $Q$, in which operations correspond to question templates and composition corresponds to substitution of sub-answers, and show how QA models can be interpreted as algebras over $Q$. Beyond reframing existing practice, this operadic perspective points toward new methods, in particular a notion of operadic consistency, which measures whether a QA model's answers agree across the partial collapses of a question decomposition tree. Empirical evaluation of operadic consistency is reported in our companion paper (Bottman, Liu, and Richardson, 2026), which finds it strongly correlated with accuracy across twelve LLMs and four multi-hop QA datasets and outperforming standard temperature-based self-consistency baselines. We argue that operads are the natural mathematical home for question decomposition, and that invariants such as operadic consistency open new directions for analyzing and improving the reliability of multi-step reasoning.
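The idea of checking answers across partial collapses can be illustrated with a small Python sketch. This is a toy reading of the abstract, not the authors' construction: the `Node` class, the noun-phrase rendering used to turn a subtree into a single question, the enumeration of collapses, and the majority-agreement score are all assumptions introduced here for illustration, and the lookup table stands in for a real QA model.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, FrozenSet, Iterator, Tuple

Path = Tuple[int, ...]  # position of a node in the tree (hypothetical convention)


@dataclass(frozen=True)
class Node:
    """A question template with positional slots, plus its sub-questions."""
    question: str                      # e.g. "What is the capital of {0}?"
    noun: str                          # noun-phrase form, e.g. "the capital of {0}"
    children: Tuple["Node", ...] = ()  # one child per slot; leaves have none


def render(node: Node) -> str:
    """Write a whole subtree as one noun phrase, without calling the model."""
    return node.noun.format(*(render(c) for c in node.children))


def answer(node: Node, model: Callable[[str], str],
           collapsed: FrozenSet[Path], path: Path = ()) -> str:
    """Answer a tree; nodes listed in `collapsed` are asked as one composite question."""
    if path in collapsed:
        slots = [render(c) for c in node.children]
    else:
        slots = [answer(c, model, collapsed, path + (i,))
                 for i, c in enumerate(node.children)]
    return model(node.question.format(*slots))


def internal_paths(node: Node, path: Path = ()) -> Iterator[Path]:
    """Yield the paths of all nodes that have at least one child."""
    if node.children:
        yield path
        for i, c in enumerate(node.children):
            yield from internal_paths(c, path + (i,))


def collapses(node: Node) -> Iterator[FrozenSet[Path]]:
    """Enumerate sets of collapsed nodes, skipping sets with nested members."""
    paths = list(internal_paths(node))
    for r in range(len(paths) + 1):
        for subset in combinations(paths, r):
            nested = any(a != b and b[:len(a)] == a for a in subset for b in subset)
            if not nested:
                yield frozenset(subset)


def consistency(node: Node, model: Callable[[str], str]) -> float:
    """Assumed score: share of collapses that agree with the most common answer."""
    answers = [answer(node, model, c) for c in collapses(node)]
    return Counter(answers).most_common(1)[0][1] / len(answers)


# Toy stand-in for a QA model: a fixed lookup table.
TABLE = {
    "In which country is the Eiffel Tower located?": "France",
    "What is the capital of France?": "Paris",
    "What is the capital of the country where the Eiffel Tower is located?": "Paris",
    "Which river flows through Paris?": "Seine",
    "Which river flows through the capital of the country where "
    "the Eiffel Tower is located?": "Seine",
}

leaf = Node("In which country is the Eiffel Tower located?",
            "the country where the Eiffel Tower is located")
mid = Node("What is the capital of {0}?", "the capital of {0}", (leaf,))
tree = Node("Which river flows through {0}?", "the river flowing through {0}", (mid,))

score = consistency(tree, TABLE.__getitem__)
```

In this sketch the three-hop chain has three collapses: fully decomposed, the middle node collapsed, and the root collapsed. A model whose direct answer to the fully collapsed question differs from the answer it reaches step by step would receive a lower score.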

Natural Language Processing, Reasoning & Chain-of-Thought

Citation Metrics

Citations: 1

Influential citations: 1

References: 15

Year: 2026

Venue: N/A
