Search papers, labs, and topics across Lattice.
The paper introduces Cross-Context Review (CCR), a method for improving LLM output quality by having the review process occur in a separate session without access to the original production context. Through a controlled experiment with injected errors across various artifact types, CCR significantly outperformed same-session self-review, repeated self-review, and context-aware subagent review in error detection (F1 of 28.6%). The results demonstrate that the benefit of CCR stems from the context separation itself, rather than simply repeating the review process.
LLMs catch 17% more errors when reviewing their own work in a fresh session, proving that a change of scenery beats iterative self-critique.
Large language models struggle to catch errors in their own outputs when the review happens in the same session that produced them. This paper introduces Cross-Context Review (CCR), a straightforward method where the review is conducted in a fresh session with no access to the production conversation history. We ran a controlled experiment: 30 artifacts (code, technical documents, presentation scripts) with 150 injected errors, tested under four review conditions -- same-session Self-Review (SR), repeated Self-Review (SR2), context-aware Subagent Review (SA), and Cross-Context Review (CCR). Over 360 reviews, CCR reached an F1 of 28.6%, outperforming SR (24.6%, p=0.008, d=0.52), SR2 (21.7%, p<0.001, d=0.72), and SA (23.8%, p=0.004, d=0.57). The SR2 result matters most for interpretation: reviewing twice in the same session did not beat reviewing once (p=0.11), which rules out repetition as an explanation for CCR's advantage. The benefit comes from context separation itself. CCR works with any model, needs no infrastructure, and costs only one extra session.