May 6, 2026 · arXiv:2605.04899

A geometric relation of the error introduced by sampling a language model's output distribution to its internal state

AI Summary

The paper investigates the sensitivity of language model outputs to single-token changes by analyzing the geometry of token embeddings. The authors derive an $\mathfrak{so}(n)$-valued 1-form from this geometry and show that its curvature couples to the model's internal world model on chess reasoning tasks. Specifically, the associated transformations cluster by board region and respect piece importance, which suggests a direct link between token space geometry and internal representations.

Key Contribution

Token embedding geometry is not just abstract math: the results suggest that it reflects how language models internally represent and reason about the world, as shown by the alignment of its curvature with board region and piece importance in chess.

Abstract

GPT-style language models are sensitive to single-token changes at generation points where the predicted probability distribution is spread across multiple tokens. Viewing this sensitivity as a geometric property, we derive an $\mathfrak{so}(n)$-valued 1-form that depends only on the geometry of the token embeddings. Despite this purely geometric origin, we show that its curvature is semantically meaningful: On chess reasoning tasks, the curvature couples to the world model of an off-the-shelf instruction-tuned model, with transformations clustering by board region and respecting piece importance. Our findings suggest that token space geometry directly reflects how models internally represent problems.
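The first sentence of the abstract refers to generation points where the predicted distribution is "spread across multiple tokens". As a minimal sketch of how such points could be flagged, the snippet below computes the Shannon entropy of a next-token distribution from raw logits. The function names (`softmax`, `entropy_bits`, `is_spread`) and the 1-bit threshold are illustrative assumptions, not the paper's definitions, and the paper's geometric construction is not reproduced here.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def is_spread(logits, threshold_bits=1.0):
    """Hypothetical criterion: call a generation point 'spread' when entropy exceeds the threshold."""
    return entropy_bits(softmax(logits)) > threshold_bits

# Toy logits over a 4-token vocabulary (illustrative values only).
peaked_logits = [10.0, 0.0, 0.0, 0.0]  # nearly all mass on one token
flat_logits = [1.0, 1.0, 1.0, 1.0]     # mass shared equally by four tokens

print(is_spread(peaked_logits), is_spread(flat_logits))
```

Under this criterion, sampling at a flat point can select any of several comparably likely tokens, which is the situation in which a single-token change is most likely to alter the continuation.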

Architecture Design (Transformers, SSMs, MoE) · Interpretability & Mechanistic Interp · World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
