Mar 2, 2026

arXiv:2603.02099

Recursive Think-Answer Process for LLMs and VLMs

Byung-Kwan Lee, Youngchae Chee, Yong Man Ro

AI Summary

The paper introduces the Recursive Think-Answer Process (R-TAP), an iterative reasoning framework for LLMs and VLMs that improves accuracy by letting models reflect on and refine their answers over repeated think-answer cycles. R-TAP uses a confidence generator to evaluate the certainty of each model response and guide the next reasoning iteration, and it is trained to reward both rising confidence across iterations and high confidence in the final answer. Experiments show that R-TAP-enhanced models consistently outperform single-pass methods. They also produce fewer self-reflective cues such as "Oops!", which the authors associate with more stable and faster inference.
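The iterative cycle described above can be pictured with a minimal sketch. It is not the authors' implementation: the `generate` and `confidence` callables, the stopping `threshold`, and the `max_rounds` cap are all hypothetical stand-ins chosen for illustration, since this summary does not specify how R-TAP decides when to stop.

```python
from typing import Callable, List, Tuple


def recursive_think_answer(
    question: str,
    generate: Callable[[str, List[str]], str],
    confidence: Callable[[str, str], float],
    threshold: float = 0.9,  # assumed stopping rule, not from the paper
    max_rounds: int = 4,     # assumed cap on think-answer cycles
) -> Tuple[str, List[float]]:
    """Repeat think-answer cycles until the confidence score is high enough.

    `generate` stands in for the reasoning model: it sees the question and
    all earlier attempts. `confidence` stands in for the confidence
    generator: it scores one answer in [0, 1].
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")

    attempts: List[str] = []
    scores: List[float] = []
    for _ in range(max_rounds):
        answer = generate(question, attempts)
        score = confidence(question, answer)
        attempts.append(answer)
        scores.append(score)
        if score >= threshold:
            break  # confident enough, so stop recursing

    # Return the final attempt together with the confidence trajectory.
    return attempts[-1], scores
```

Returning the full list of scores, rather than only the last one, keeps the confidence trajectory available for inspection or for computing rewards during training.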

Key Contribution

LLMs and VLMs can iteratively refine their reasoning and reduce errors by recursively evaluating the confidence of their own answers and improving on low-confidence ones, leading to more stable and faster inference.

Abstract

Think-Answer reasoners such as DeepSeek-R1 have made notable progress by leveraging interpretable internal reasoning. However, despite the frequent presence of self-reflective cues like "Oops!", they remain vulnerable to output errors during single-pass inference. To address this limitation, we propose an efficient Recursive Think-Answer Process (R-TAP) that enables models to engage in iterative reasoning cycles and generate more accurate answers, going beyond conventional single-pass approaches. Central to this approach is a confidence generator that evaluates the certainty of model responses and guides subsequent improvements. By incorporating two complementary rewards, the Recursively Confidence Increase Reward and the Final Answer Confidence Reward, we show that R-TAP-enhanced models consistently outperform conventional single-pass methods for both large language models (LLMs) and vision-language models (VLMs). Moreover, by analyzing the frequency of "Oops"-like expressions in model responses, we find that R-TAP-applied models exhibit significantly fewer self-reflective patterns, resulting in more stable and faster inference-time reasoning. We hope R-TAP paves the way toward efficient and elaborate methods for refining the reasoning processes of future AI.
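The abstract names the two rewards but does not give their formulas, so the following is only an illustrative sketch of how such signals could be computed from a sequence of per-iteration confidence scores. The function names, the "fraction of strictly increasing steps" definition, and the equal weighting are assumptions made here, not the paper's definitions.

```python
from typing import Sequence


def confidence_increase_reward(scores: Sequence[float]) -> float:
    """Hypothetical stand-in for the Recursively Confidence Increase Reward.

    Returns the fraction of consecutive iterations in which the confidence
    score strictly increased (0.0 when there are fewer than two scores).
    """
    if len(scores) < 2:
        return 0.0
    gains = sum(1 for prev, cur in zip(scores, scores[1:]) if cur > prev)
    return gains / (len(scores) - 1)


def final_confidence_reward(scores: Sequence[float]) -> float:
    """Hypothetical stand-in for the Final Answer Confidence Reward.

    Uses the confidence of the last iteration directly as the reward.
    """
    return scores[-1] if scores else 0.0


def combined_reward(
    scores: Sequence[float],
    w_increase: float = 0.5,  # assumed weight, not from the paper
    w_final: float = 0.5,     # assumed weight, not from the paper
) -> float:
    """Weighted sum of the two illustrative rewards."""
    return (
        w_increase * confidence_increase_reward(scores)
        + w_final * final_confidence_reward(scores)
    )
```

Under these assumed definitions, a trajectory such as `[0.4, 0.7, 0.95]` earns the full increase reward because every step raises confidence, while `[0.8, 0.6, 0.9]` earns only half of it even though it ends with high confidence. This shows why the two signals are complementary.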

Multimodal Models

Reasoning & Chain-of-Thought

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
