This paper investigates which properties of a chain-of-thought (CoT) rationale drive accuracy gains at probe time, that is, when a fixed rationale is placed in context. It identifies two complementary sources of the gain. The first is lexical activation: even a globally word-shuffled rationale substantially outperforms the no-rationale baseline. The second, which accounts for most of the remaining gain, is local co-occurrence of tokens within short contiguous windows (2-3 tokens), rather than sentence-level logical ordering or full grammatical realization. The qualitative pattern is stable across model families, parameter scales, and datasets, supporting a local co-occurrence activation (LCA) account of probe-time CoT.
At probe time, chain-of-thought gains appear to come less from sentence-level logical derivation than from lexical activation and short-range token adjacency.
Chain-of-thought (CoT) prompting reliably improves language-model accuracy, but which properties of a rationale text drive the improvement is poorly understood. Prior work has largely studied generation-time behavior. We instead ask a probe-time question: given a fixed rationale in context, what in that text changes the answer? We identify two complementary sources of the gain. First, even a globally word-shuffled rationale substantially outperforms the no-rationale baseline, indicating a strong lexical activation effect. More importantly, the additional gain from structured text appears to arise less from sentence-level logical ordering and more from short-range token adjacency. Preserving contiguous windows of just $n^\star{=}2$--$3$ tokens recovers most of the remaining gain toward full CoT performance. Supporting experiments rule out copying of explicit answer declarations or answer values, as well as full grammatical realization, as primary drivers. Further generalization experiments show that the qualitative pattern remains stable across multiple model families, parameter scales, and datasets. These results support a local co-occurrence activation (LCA) account of probe-time CoT, in which the observed gains appear to arise primarily from lexical activation and short-range token co-occurrence rather than sentence-level logical derivation.
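The window-preserving perturbation described above can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: whitespace tokenization, non-overlapping windows starting at the first token, and a uniform shuffle of the windows. The function name `window_shuffle` and the example rationale are hypothetical and are not taken from the paper.

```python
import random


def window_shuffle(tokens, n, seed=0):
    """Shuffle a token list while keeping contiguous windows of n tokens intact.

    Assumed scheme (not confirmed by the source): split the tokens into
    consecutive, non-overlapping windows of length n, then permute the windows.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    # The final window may be shorter than n when len(tokens) is not a multiple of n.
    windows = [tokens[i:i + n] for i in range(0, len(tokens), n)]
    rng.shuffle(windows)
    # Flatten the permuted windows back into a single token sequence.
    return [tok for window in windows for tok in window]


# Hypothetical rationale, tokenized on whitespace for illustration only.
rationale = "each box holds 4 apples so 3 boxes hold 12 apples".split()

word_shuffled = window_shuffle(rationale, n=1)     # global word shuffle
bigram_preserved = window_shuffle(rationale, n=2)  # keeps adjacent pairs together
```

With `n=1` this reduces to the globally word-shuffled condition, and with `n` at least the rationale length the text is left unchanged, so sweeping `n` interpolates between the two extremes under these assumptions.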