Kyoto | May 27, 2026 | arXiv:2605.28305

Revisiting Anthropomorphic Reflection Markers in Large Language Model Reasoning

AI Summary

This paper investigates the role of anthropomorphic reflection markers (e.g., "wait," "hmm") in LLM reasoning by suppressing these markers at both the prompt and token levels. The study evaluates the impact of this suppression on performance across four benchmarks and two model scales. The key finding is that these markers are not uniformly necessary for reasoning and their suppression can even improve performance, suggesting they are surface cues rather than reliable indicators of genuine reflection.

Key Contribution

LLMs do not need markers such as "wait, let me think..." in order to reason: suppressing these anthropomorphic markers preserves performance, and in several settings improves it.

Abstract

Large Language Models (LLMs) often produce explicit reflective traces during complex reasoning, accompanied by anthropomorphic markers such as "wait", "hmm", and "alternatively". Although these markers are commonly treated as visible indicators of reflection, their mechanisms remain unclear, and redundant or repetitive markers carry a risk of overthinking. In this work, we revisit anthropomorphic reflection markers, examining whether they are necessary for reasoning and what role they play in reflection. We suppress these markers through prompt-level and token-level interventions and analyze the effects on task performance across four benchmarks and two model scales. Our results show that anthropomorphic markers are not uniformly necessary for reasoning performance: suppressing them can preserve or improve performance in several settings, especially under larger sampling budgets. Meanwhile, marker suppression does not necessarily remove reflection behavior, as models can still perform marker-free verification. These findings suggest that anthropomorphic markers tend to be surface cues rather than reliable proxies for reflection itself, and motivate future research on reasoning mechanisms beyond explicit marker patterns.
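The abstract does not say how the token-level intervention is implemented. As a rough illustration only, one common way to suppress specific tokens is to set their logits to negative infinity before the next token is chosen. The sketch below assumes that approach. The marker list comes from the examples in the abstract, while the function names (count_markers, suppress_markers, greedy_pick), the toy vocabulary, and the logit values are all hypothetical and not taken from the paper.

```python
import math
import re

# Marker list taken from the examples in the abstract; the paper's full list may differ.
MARKERS = ("wait", "hmm", "alternatively")


def count_markers(trace: str) -> dict[str, int]:
    """Count whole-word, case-insensitive occurrences of each marker in a reasoning trace."""
    return {
        m: len(re.findall(rf"\b{re.escape(m)}\b", trace, flags=re.IGNORECASE))
        for m in MARKERS
    }


def suppress_markers(logits: list[float], vocab: list[str]) -> list[float]:
    """Return a copy of logits in which marker tokens are set to -inf.

    Assumption: each marker corresponds to a single vocabulary entry, possibly
    with leading whitespace or different casing. Real tokenizers may split a
    marker across several tokens, which this sketch does not handle.
    """
    banned = {i for i, tok in enumerate(vocab) if tok.strip().lower() in MARKERS}
    return [-math.inf if i in banned else x for i, x in enumerate(logits)]


def greedy_pick(logits: list[float], vocab: list[str]) -> str:
    """Pick the highest-logit token (greedy decoding over one step)."""
    best = max(range(len(logits)), key=logits.__getitem__)
    return vocab[best]


# Toy next-token scores (made-up numbers, for illustration only).
vocab = ["Wait", " so", " hmm", " the", " Alternatively"]
logits = [3.2, 1.5, 2.9, 2.0, 0.7]

before = greedy_pick(logits, vocab)                           # marker token wins
after = greedy_pick(suppress_markers(logits, vocab), vocab)   # best non-marker token wins
counts = count_markers("Wait, that is wrong. Hmm, wait... alternatively, try x.")
```

In this toy setup the unsuppressed step selects a marker token, while the suppressed step falls back to the highest-scoring non-marker token. A prompt-level intervention would instead instruct the model not to use such markers, and a counter like count_markers could then check how often they still appear in the output.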

Interpretability & Mechanistic Interp | Reasoning & Chain-of-Thought

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
