Mar 30, 2026 | arXiv:2603.28258

Categorical Perception in Large Language Model Hidden States: Structural Warping at Digit-Count Boundaries

AI Summary

This paper investigates whether categorical perception (CP) effects, commonly observed in human perception, are present in the hidden-state representations of LLMs processing Arabic numerals. Using representational similarity analysis across six LLMs, the authors find that model representations exhibit enhanced discriminability at digit-count boundaries (10 and 100), indicating CP-like warping. They identify two distinct CP signatures: "classic CP", where models both categorize explicitly and show geometric warping, and "structural CP", where only geometric warping is present. The second signature suggests that structural input discontinuities can induce CP independently of semantic knowledge.

Key Contribution

LLMs exhibit categorical perception-like warping in their hidden state representations at digit-count boundaries, even without explicit semantic category knowledge, revealing a surprising sensitivity to structural input discontinuities.

Abstract

Categorical perception (CP) -- enhanced discriminability at category boundaries -- is among the most studied phenomena in perceptual psychology. This paper reports that analogous geometric warping occurs in the hidden-state representations of large language models (LLMs) processing Arabic numerals. Using representational similarity analysis across six models from five architecture families, the study finds that a CP-additive model (log-distance plus a boundary boost) fits the representational geometry better than a purely continuous model at 100% of primary layers in every model tested. The effect is specific to structurally defined boundaries (digit-count transitions at 10 and 100), absent at non-boundary control positions, and absent in the temperature domain where linguistic categories (hot/cold) lack a tokenisation discontinuity. Two qualitatively distinct signatures emerge: "classic CP" (Gemma, Qwen), where models both categorise explicitly and show geometric warping, and "structural CP" (Llama, Mistral, Phi), where geometry warps at the boundary but models cannot report the category distinction. This dissociation is stable across boundaries and is a property of the architecture, not the stimulus. Structural input-format discontinuities are sufficient to produce categorical perception geometry in LLMs, independently of explicit semantic category knowledge.
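
As an illustration of the comparison described in the abstract, the sketch below fits a purely continuous model (log-distance only) and a CP-additive model (log-distance plus a boundary indicator) to a dissimilarity matrix and compares the fit. It is a minimal sketch under stated assumptions, not the authors' procedure: the least-squares fit, the R² comparison, the stimulus range 1 to 199, and all function names are assumptions, and the "observed" dissimilarities are synthetic values generated with an injected boundary boost, not measurements from any model.

```python
import numpy as np

def model_rdms(numbers):
    """Build the two predictor matrices for a list of positive integers."""
    logs = np.log(np.asarray(numbers, dtype=float))
    # Continuous predictor: distance on a log scale.
    log_dist = np.abs(logs[:, None] - logs[None, :])
    # Boundary predictor: 1 when two numerals differ in digit count
    # (e.g. 9 vs 10, or 99 vs 100), else 0.
    digits = np.array([len(str(n)) for n in numbers])
    boundary = (digits[:, None] != digits[None, :]).astype(float)
    return log_dist, boundary

def upper(matrix):
    """Flatten the strict upper triangle (each unordered pair once)."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]

def fit_r2(predictors, target):
    """Ordinary least squares with an intercept; returns (R^2, coefficients)."""
    X = np.column_stack([np.ones_like(target)] + predictors)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    ss_tot = ((target - target.mean()) ** 2).sum()
    return 1.0 - float(resid @ resid) / float(ss_tot), coef

numbers = list(range(1, 200))  # assumed stimulus range spanning 10 and 100
log_dist, boundary = model_rdms(numbers)

# SYNTHETIC stand-in for hidden-state dissimilarities: log-distance plus an
# injected boundary boost of 0.5 and Gaussian noise. Not real model data.
rng = np.random.default_rng(0)
x_log, x_bnd = upper(log_dist), upper(boundary)
observed = x_log + 0.5 * x_bnd + rng.normal(0.0, 0.1, size=x_log.shape)

r2_continuous, _ = fit_r2([x_log], observed)
r2_cp_additive, coef = fit_r2([x_log, x_bnd], observed)
print(f"continuous R^2:  {r2_continuous:.4f}")
print(f"CP-additive R^2: {r2_cp_additive:.4f}, boundary weight: {coef[2]:.3f}")
```

Because the two models are nested, the CP-additive fit can never have a lower in-sample R² than the continuous one, so a real analysis would need a criterion that penalises the extra parameter or a held-out comparison. With real data, `observed` would be replaced by pairwise dissimilarities computed from a layer's hidden states for each numeral.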

Architecture Design (Transformers, SSMs, MoE) | Interpretability & Mechanistic Interp | Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
