HKUST, HUST, PolyU, UPenn | Apr 15, 2026 | arXiv:2604.14121

Correct Prediction, Wrong Steps? Consensus Reasoning Knowledge Graph for Robust Chain-of-Thought Synthesis

Zipeng Ling, Shuliang Liu, Shenghong Fu, Yuehao Tang, Seonil Son, Yao Wan, Xuming Hu

AI Summary

The paper introduces CRAFT, a framework that improves LLM reasoning by constructing a Reasoning Knowledge Graph (RKG) from multiple candidate reasoning traces and synthesizing a high-quality trace via topological generation. Directly guiding LLMs with ground-truth labels yields no improvement; CRAFT instead mitigates both step-internal flaws (e.g., logical errors) and step-wise flaws (e.g., overthinking) in reasoning traces. Experiments show CRAFT improves label-prediction accuracy by 10+% on average and enhances the quality of reasoning traces across logical and mathematical reasoning benchmarks.

Key Contribution

Ground-truth labels don't improve LLM reasoning, but building a consensus-based knowledge graph of reasoning steps does, boosting label-prediction accuracy by 10+% on average.

Abstract

LLM reasoning traces suffer from complex flaws -- *Step Internal Flaws* (logical errors, hallucinations, etc.) and *Step-wise Flaws* (overthinking, underthinking), which vary by sample. A natural approach would be to provide ground-truth labels to guide LLMs' reasoning. Contrary to intuition, we show that this yields no improvement in reasoning ability. We then propose CRAFT, a unified framework that mitigates both types of flaws: it builds a Reasoning Knowledge Graph (RKG) from the consensus parts of multiple candidate traces and synthesizes a high-quality trace through topological generation. Our approach improves label-prediction accuracy by 10+% on average and consistently outperforms all baselines across both logical and mathematical reasoning benchmarks. Further, detailed benchmark evaluation shows that our method also improves the quality of LLMs' reasoning traces along multiple dimensions.
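The abstract does not spell out how the RKG is built or traversed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that reasoning steps can be matched by exact string equality, that "consensus" means a step appears in at least `min_support` traces, and that edges follow the order in which consensus steps appear within each trace. The names `build_consensus_graph`, `synthesize_trace`, and `min_support` are hypothetical.

```python
from collections import Counter
from graphlib import TopologicalSorter


def build_consensus_graph(traces, min_support=2):
    """Build a step graph from the steps shared by several traces.

    Assumption: steps are plain strings and identical strings denote the
    same step. A real system would need semantic matching instead.
    Returns a dict mapping each kept step to the set of its predecessors.
    """
    # Count each step once per trace so repetition inside a single
    # trace does not inflate its support.
    support = Counter(step for trace in traces for step in set(trace))
    kept = {step for step, count in support.items() if count >= min_support}

    predecessors = {step: set() for step in kept}
    for trace in traces:
        # Drop non-consensus steps and keep only the first occurrence
        # of each remaining step, preserving order.
        filtered = list(dict.fromkeys(s for s in trace if s in kept))
        for earlier, later in zip(filtered, filtered[1:]):
            predecessors[later].add(earlier)
    return predecessors


def synthesize_trace(predecessors):
    """Order the consensus steps so every step follows its predecessors.

    Raises graphlib.CycleError if the traces disagree on ordering.
    """
    return list(TopologicalSorter(predecessors).static_order())


# Three toy candidate traces; "detour" appears only once and is dropped.
traces = [
    ["parse", "setup", "solve", "check"],
    ["parse", "setup", "detour", "solve", "check"],
    ["parse", "solve", "check"],
]
graph = build_consensus_graph(traces)
synthesized = synthesize_trace(graph)
```

In this toy setting, a step that only one trace contains (a stand-in for overthinking) is filtered out, while a step that one trace skips (a stand-in for underthinking) is restored from the others. Conflicting step orders across traces would produce a cycle, which this sketch does not resolve.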

Eval Frameworks & Benchmarks | Reasoning & Chain-of-Thought

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
