HIT, PKU, UMacau, ZJU | Mar 5, 2026 | arXiv:2603.05120

Bidirectional Curriculum Generation: A Multi-Agent Framework for Data-Efficient Mathematical Reasoning

Boren Hu, Xiao Liu, Boci Peng, Xinping Zhao, Xiaoran Shang, Yun Zhu, Lijun Wu

AI Summary

This paper introduces Bidirectional Curriculum Generation, a multi-agent framework that dynamically adjusts the difficulty of mathematical reasoning problems for LLM training. The framework uses a closed feedback loop to either complicate problems to challenge the model or simplify them to address specific reasoning failures. By optimizing the learning trajectory based on the Optimal Pacing Theorem, the approach achieves superior reasoning performance with significantly fewer instruction samples compared to unidirectional curriculum learning baselines.

Key Contribution

LLMs can learn mathematical reasoning far more efficiently by adaptively simplifying problems to address specific weaknesses, rather than just escalating complexity.

Abstract

Enhancing mathematical reasoning in Large Language Models typically demands massive datasets, yet data efficiency remains a critical bottleneck. While Curriculum Learning attempts to structure this process, standard unidirectional approaches (simple-to-complex) suffer from inefficient sample utilization: they blindly escalate complexity even when foundational gaps persist, leading to wasted computation on unsolvable problems. To maximize the instructional value of every training sample, we introduce a novel Bidirectional Curriculum Generation framework. Unlike rigid trajectories, our multi-agent ecosystem mimics adaptive pedagogy to establish a closed feedback loop. It dynamically generates data by either complicating problems to challenge the model or, crucially, simplifying them to repair specific reasoning failures. This mechanism ensures that the model consumes only the most effective data at any given stage. Grounded in the Optimal Pacing Theorem, our approach optimizes the learning trajectory, significantly outperforming baselines while achieving superior reasoning performance with substantially fewer instruction samples.
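The closed feedback loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Problem` class, the integer `difficulty` level, the one-step adjustment rule, and the `next_problem` and `run_curriculum` helpers are all hypothetical names introduced here for illustration. In the actual framework, agents would presumably rewrite the problem itself; the sketch only tags the text as a placeholder.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    text: str
    difficulty: int  # hypothetical integer difficulty level (assumption)


def next_problem(problem: Problem, solved: bool,
                 min_level: int = 1, max_level: int = 10) -> Problem:
    """Move one step harder after a success, one step easier after a failure.

    The text tag stands in for the problem rewriting that a generation
    agent would perform; no real rewriting happens here.
    """
    if solved:
        level = min(problem.difficulty + 1, max_level)
        tag = "complicated"
    else:
        level = max(problem.difficulty - 1, min_level)
        tag = "simplified"
    return Problem(text=f"[{tag}] {problem.text}", difficulty=level)


def run_curriculum(seed: Problem, outcomes: list[bool]) -> list[Problem]:
    """Build the sequence of problems produced by a series of solve outcomes."""
    trajectory = [seed]
    for solved in outcomes:
        trajectory.append(next_problem(trajectory[-1], solved))
    return trajectory


if __name__ == "__main__":
    seed = Problem("Solve 2x + 3 = 7", difficulty=3)
    for p in run_curriculum(seed, [True, True, False]):
        print(p.difficulty, p.text)
```

A unidirectional curriculum would correspond to ignoring the `solved` flag and always taking the harder branch; the bidirectional variant lets a failure pull the difficulty back down.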

Topics: Data Curation & Synthetic Data; Reasoning & Chain-of-Thought; Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 32

Year: 2026

Venue: N/A
