UPenn · Feb 19, 2026 · arXiv:2602.17633

When to Trust the Cheap Check: Weak and Strong Verification for Reasoning

Shayan Kiyani, Sima Noorani, George Pappas, Hamed Hassani

AI Summary

This paper formalizes the tension between weak (cheap, noisy) and strong (expensive, reliable) verification methods used in LLM reasoning loops. It introduces weak-strong verification policies to decide when to accept/reject based on weak verification or defer to strong verification. The authors demonstrate that optimal policies have a two-threshold structure and that calibration and sharpness determine the value of weak verifiers, further developing an online algorithm with provable error control.

Key Contribution

Stop blindly trusting self-consistency: this work reveals how to optimally combine cheap "weak" checks with expensive "strong" verification to improve LLM reasoning.

Abstract

Reasoning with LLMs increasingly unfolds inside a broader verification loop. Internally, systems use cheap checks, such as self-consistency or proxy rewards, which we call weak verification. Externally, users inspect outputs and steer the model through feedback until results are trustworthy, which we call strong verification. These signals differ sharply in cost and reliability: strong verification can establish trust but is resource-intensive, while weak verification is fast and scalable but noisy and imperfect. We formalize this tension through weak-strong verification policies, which decide when to accept or reject based on weak verification and when to defer to strong verification. We introduce metrics capturing incorrect acceptance, incorrect rejection, and strong-verification frequency. At the population level, we show that optimal policies admit a two-threshold structure and that calibration and sharpness govern the value of weak verifiers. Building on this, we develop an online algorithm that provably controls acceptance and rejection errors without assumptions on the query stream, the language model, or the weak verifier.
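
To make the two-threshold structure concrete, the following is a minimal sketch rather than the authors' algorithm. It assumes the weak verifier returns a scalar score where higher means "more likely correct", and that the policy accepts above an upper threshold, rejects below a lower one, and defers to strong verification in between. The names `two_threshold_policy` and `evaluate_policy`, the threshold values, and the normalization of the three metrics by the total number of queries are all illustrative assumptions; the paper may define these quantities differently.

```python
from typing import Dict, Iterable, Tuple

ACCEPT, REJECT, DEFER = "accept", "reject", "defer"


def two_threshold_policy(score: float, lower: float, upper: float) -> str:
    """Map a weak-verifier score to a decision (hypothetical helper).

    Assumption: higher score means the response is more likely correct.
    """
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if score >= upper:
        return ACCEPT   # trust the cheap check: accept
    if score <= lower:
        return REJECT   # trust the cheap check: reject
    return DEFER        # ambiguous region: pay for strong verification


def evaluate_policy(
    records: Iterable[Tuple[float, bool]], lower: float, upper: float
) -> Dict[str, float]:
    """Empirical versions of the three metrics named in the abstract.

    Each record is (weak_score, is_correct). All rates are fractions of
    the total number of queries, which is an assumed normalization.
    """
    total = wrong_accept = wrong_reject = deferred = 0
    for score, is_correct in records:
        total += 1
        decision = two_threshold_policy(score, lower, upper)
        if decision == DEFER:
            deferred += 1
        elif decision == ACCEPT and not is_correct:
            wrong_accept += 1
        elif decision == REJECT and is_correct:
            wrong_reject += 1
    if total == 0:
        raise ValueError("no records to evaluate")
    return {
        "incorrect_acceptance": wrong_accept / total,
        "incorrect_rejection": wrong_reject / total,
        "strong_verification_frequency": deferred / total,
    }


if __name__ == "__main__":
    # Toy data: (weak score, whether the response was actually correct).
    toy = [(0.95, True), (0.90, False), (0.50, True), (0.10, False), (0.05, True)]
    print(evaluate_policy(toy, lower=0.2, upper=0.8))
```

Widening the gap between the two thresholds sends more queries to strong verification and leaves fewer decisions to the noisy weak score, which is the cost-versus-error trade-off the abstract describes.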

Eval Frameworks & Benchmarks Reasoning & Chain-of-Thought Scalable Oversight & Alignment Theory

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
