Feb 17, 2026 | arXiv:2602.15725

Recursive Concept Evolution for Compositional Reasoning in Large Language Models

AI Summary

The paper introduces Recursive Concept Evolution (RCE), a framework that allows LLMs to modify their internal representation geometry during inference to improve compositional reasoning. RCE dynamically generates low-rank concept subspaces when representational inadequacy is detected, selects them based on minimum description length, merges synergistic subspaces, and consolidates them via constrained optimization. Integrated with Mistral-7B, RCE is reported to yield 12-18 point gains on ARC-AGI-2, 8-14 point improvements on GPQA and BBH, and consistent reductions in depth-induced error on MATH and HLE.

Key Contribution

RCE lets a pretrained LLM construct new reasoning abstractions at inference time, without retraining, with reported gains on compositional tasks of up to 18 points (on ARC-AGI-2).

Abstract

Large language models achieve strong performance on many complex reasoning tasks, yet their accuracy degrades sharply on benchmarks that require compositional reasoning, including ARC-AGI-2, GPQA, MATH, BBH, and HLE. Existing methods improve reasoning by expanding token-level search through chain-of-thought prompting, self-consistency, or reinforcement learning, but they leave the model's latent representation space fixed. When the required abstraction is not already encoded in this space, performance collapses. We propose Recursive Concept Evolution (RCE), a framework that enables pretrained language models to modify their internal representation geometry during inference. RCE introduces dynamically generated low-rank concept subspaces that are spawned when representational inadequacy is detected, selected through a minimum description length criterion, merged when synergistic, and consolidated via constrained optimization to preserve stability. This process allows the model to construct new abstractions rather than recombining existing ones. We integrate RCE with Mistral-7B and evaluate it across compositional reasoning benchmarks. RCE yields 12-18 point gains on ARC-AGI-2, 8-14 point improvements on GPQA and BBH, and consistent reductions in depth-induced error on MATH and HLE.
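The abstract does not say how concept subspaces are built or what form the minimum description length criterion takes. The following is a minimal sketch of the spawn-and-select idea only, under stated assumptions: the subspace is taken from an SVD of the unexplained hidden-state residuals, and the description length is a BIC-style two-part code. The function names (`spawn_subspace`, `description_length`, `select_rank`) and the cost formula are hypothetical illustrations, not the paper's method, and the merge and consolidation steps are not covered.

```python
import numpy as np

def spawn_subspace(residuals, rank):
    """Hypothetical spawn step: top-`rank` principal directions of the residuals.

    residuals: (n, d) array of hidden-state components that the current
    representation fails to explain (how these are obtained is an assumption).
    """
    _, _, vt = np.linalg.svd(residuals, full_matrices=False)
    return vt[:rank].T  # (d, rank) with orthonormal columns

def description_length(residuals, basis):
    """BIC-style two-part code length, an assumed stand-in for the MDL criterion."""
    n, d = residuals.shape
    if basis is None:
        leftover, n_params = residuals, 0
    else:
        # Remove the part of each residual that lies inside the candidate subspace.
        leftover = residuals - residuals @ basis @ basis.T
        # Count the basis entries plus one coefficient per sample per direction.
        n_params = basis.size + n * basis.shape[1]
    mse = max(float(np.mean(leftover ** 2)), 1e-12)
    data_cost = 0.5 * n * d * np.log(mse)          # cost of what stays unexplained
    model_cost = 0.5 * n_params * np.log(n * d)    # cost of describing the subspace
    return data_cost + model_cost

def select_rank(residuals, candidate_ranks):
    """Return the rank with the shortest description length (0 means spawn nothing).

    Candidate ranks should stay well below d; at full rank the leftover is zero
    and the clamped log term makes the score meaningless.
    """
    best_rank, best_len = 0, description_length(residuals, None)
    for r in candidate_ranks:
        length = description_length(residuals, spawn_subspace(residuals, r))
        if length < best_len:
            best_rank, best_len = r, length
    return best_rank
```

In this toy setting, a new subspace is accepted only when the reduction in unexplained residual outweighs the cost of describing the added directions, which is the trade-off an MDL-style criterion is meant to capture.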

Architecture Design (Transformers, SSMs, MoE) | Eval Frameworks & Benchmarks | Reasoning & Chain-of-Thought

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
