Mar 15, 2026 | arXiv:2603.14458

Distilling Reasoning Without Knowledge: A Framework for Reliable LLMs

Auksarapak Kietkajornrit, Jad Tarifi, Nima Asgharbeygi

AI Summary

This paper introduces a modular framework for fact-seeking question answering that separates planning from factual retrieval and answer synthesis. A lightweight student planner is trained with a teacher-student approach to generate structured decompositions consisting of abstract reasoning steps and searchable fact requests. Its supervision contains only planning traces and fact requests, with no factual answers or retrieved evidence. Experiments on the SEAL-0 benchmark show that this supervised planning approach improves both accuracy and latency compared to monolithic reasoning models and prompt-based tool-augmented frameworks.

Key Contribution

Explicitly training a lightweight planner to decompose questions into abstract reasoning steps and searchable fact requests improves both the accuracy and the latency of fact-seeking LLMs relative to monolithic reasoning models and prompt-based tool-augmented frameworks.

Abstract

Fact-seeking question answering with large language models (LLMs) remains unreliable when answers depend on up-to-date or conflicting information. Although retrieval-augmented and tool-using LLMs reduce hallucinations, they often rely on implicit planning, leading to inefficient tool usage. We propose a modular framework that explicitly separates planning from factual retrieval and answer synthesis. A lightweight student planner is trained via a teacher-student framework to generate structured decompositions consisting of abstract reasoning steps and searchable fact requests. The supervision signals contain only planning traces and fact requests, without providing factual answers or retrieved evidence. At inference, the planner produces plans, while prompt-engineered modules perform retrieval and response synthesis. We evaluate the proposed framework on SEAL-0, an extremely challenging benchmark for search-augmented LLMs. Results show that supervised planning improves both accuracy and latency compared to monolithic reasoning models and prompt-based tool-augmented frameworks, demonstrating that explicitly learned planning structures are essential for reliable fact-seeking LLMs.
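The separation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Plan` fields, the `answer_question` function, and the three toy components are hypothetical names chosen for illustration, and the stubs stand in for the trained student planner and the prompt-engineered retrieval and synthesis modules. The only thing the sketch aims to show is the data flow, in which the planner emits reasoning steps and fact requests but never facts, and evidence enters only through retrieval.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Plan:
    """Structured decomposition (hypothetical schema): steps plus fact requests."""
    reasoning_steps: List[str]   # abstract reasoning steps, no factual content
    fact_requests: List[str]     # searchable queries to hand to the retriever


# Hypothetical interfaces for the three modules named in the abstract.
Planner = Callable[[str], Plan]
Retriever = Callable[[str], str]
Synthesizer = Callable[[str, Plan, Dict[str, str]], str]


def answer_question(
    question: str,
    planner: Planner,
    retriever: Retriever,
    synthesizer: Synthesizer,
) -> str:
    """Run planning, retrieval, and synthesis as three separate stages."""
    plan = planner(question)  # stage 1: plan only, no facts
    # Stage 2: one retrieval call per fact request in the plan.
    evidence = {request: retriever(request) for request in plan.fact_requests}
    # Stage 3: combine the question, the plan, and the retrieved evidence.
    return synthesizer(question, plan, evidence)


# Toy stand-ins (assumptions) so the sketch runs without any model or search API.
def toy_planner(question: str) -> Plan:
    return Plan(
        reasoning_steps=["Identify the entity in question", "Compare the retrieved facts"],
        fact_requests=["current holder of the title", "date the title was awarded"],
    )


def toy_retriever(request: str) -> str:
    return f"<evidence for: {request}>"


def toy_synthesizer(question: str, plan: Plan, evidence: Dict[str, str]) -> str:
    lines = [f"Q: {question}"]
    lines += [f"- {request}: {found}" for request, found in evidence.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    print(answer_question("Who holds the title now?", toy_planner, toy_retriever, toy_synthesizer))
```

Because the planner's output carries no facts, a supervision record under this assumed schema would contain only the `reasoning_steps` and `fact_requests` fields, matching the abstract's statement that training uses planning traces and fact requests without answers or retrieved evidence.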

Inference & Quantization | Reasoning & Chain-of-Thought | Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
