Feb 25, 2026 | arXiv:2602.21814

Prompt Architecture Determines Reasoning Quality: A Variable Isolation Study on the Car Wash Problem

AI Summary

This paper investigates how individual prompt architecture layers affect the performance of Claude 3.5 Sonnet on the "car wash problem," a benchmark requiring implicit physical constraint inference. Through a variable isolation study, the authors show that the STAR reasoning framework alone raises accuracy from 0% to 85%. Adding user profile context via vector database retrieval contributes a further 10 percentage points, and RAG context another 5, reaching 100% accuracy in the full-stack condition. The results suggest that structured reasoning scaffolds matter more than context injection for this task.

Key Contribution

Forget fancy RAG pipelines: forcing LLMs to articulate the goal before reasoning about constraints is the real secret to solving the "car wash problem."

Abstract

Large language models consistently fail the "car wash problem," a viral reasoning benchmark requiring implicit physical constraint inference. We present a variable isolation study (n=20 per condition, 6 conditions, 120 total trials) examining which prompt architecture layers in a production system enable correct reasoning. Using Claude 3.5 Sonnet with controlled hyperparameters (temperature 0.7, top_p 1.0), we find that the STAR (Situation-Task-Action-Result) reasoning framework alone raises accuracy from 0% to 85% (p=0.001, Fisher's exact test, odds ratio 13.22). Adding user profile context via vector database retrieval provides a further 10 percentage point gain, while RAG context contributes an additional 5 percentage points, achieving 100% accuracy in the full-stack condition. These results suggest that structured reasoning scaffolds -- specifically, forced goal articulation before inference -- matter substantially more than context injection for implicit constraint reasoning tasks.
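The accuracy figures in the abstract are cumulative, so it can help to see them as trial counts out of 20. The sketch below is only an arithmetic illustration of the stated percentages. The condition labels (`baseline`, `star`, `star_profile`, `full_stack`) are hypothetical names chosen for this example, not the paper's own, and only the four conditions described in the abstract are covered.

```python
# Illustrative arithmetic only. Condition labels are hypothetical, not the paper's names.
N_PER_CONDITION = 20  # from the abstract: n=20 per condition

# Accuracy (in percent) as stated in the abstract, built up cumulatively.
baseline = 0                      # accuracy before the STAR framework is added
star = 85                         # STAR framework alone
star_profile = star + 10          # +10 percentage points from user profile context
full_stack = star_profile + 5     # +5 percentage points from RAG context

accuracies = {
    "baseline": baseline,
    "star": star,
    "star_profile": star_profile,
    "full_stack": full_stack,
}


def correct_trials(pct: int, n: int = N_PER_CONDITION) -> int:
    """Convert an accuracy percentage into a count of correct trials out of n."""
    return round(pct * n / 100)


counts = {name: correct_trials(pct) for name, pct in accuracies.items()}

for name, pct in accuracies.items():
    print(f"{name}: {pct}% -> {counts[name]}/{N_PER_CONDITION} correct")
```

With 20 trials per condition, each trial is worth 5 percentage points, so the 5-point gain attributed to RAG context corresponds to a single additional correct trial.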

Architecture Design (Transformers, SSMs, MoE) | Eval Frameworks & Benchmarks | Natural Language Processing | Reasoning & Chain-of-Thought

Citation Metrics

Citations: 0

Influential citations: 0

References: 7

Year: 2026

Venue: N/A
