The authors investigate how language model scale affects contextual entrainment, the tendency to favor tokens that appeared in the context. They find that larger models become better at ignoring false claims (semantic contexts) but worse at ignoring irrelevant tokens (non-semantic contexts). Through a scaling-laws analysis of the Cerebras-GPT and Pythia model families, they show that semantic and non-semantic entrainment exhibit opposite power-law scaling trends.
Scaling up LLMs doesn't uniformly improve context handling; instead, it paradoxically amplifies the tendency to copy irrelevant tokens while simultaneously improving resistance to misinformation.
Larger language models become simultaneously better and worse at handling contextual information -- better at ignoring false claims, worse at ignoring irrelevant tokens. We formalize this apparent paradox through the first scaling laws for contextual entrainment, the tendency of models to favor tokens that appeared in context regardless of relevance. Analyzing the Cerebras-GPT (111M-13B) and Pythia (410M-12B) model families, we find entrainment follows predictable power-law scaling, but with opposite trends depending on context type: semantic contexts show decreasing entrainment with scale, while non-semantic contexts show increasing entrainment. Concretely, the largest models are four times more resistant to counterfactual misinformation than the smallest, yet simultaneously twice as prone to copying arbitrary tokens. These diverging trends, which replicate across model families, suggest that semantic filtering and mechanical copying are functionally distinct behaviors that scale in opposition -- scaling alone does not resolve context sensitivity, it reshapes it.
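The opposing trends can be made concrete with a power-law fit of the form E(N) = a * N^b, where N is parameter count and E is an entrainment score: a negative exponent b corresponds to the decreasing semantic trend, a positive exponent to the increasing non-semantic one. The sketch below illustrates the fitting procedure on made-up numbers; the scores, function name, and exact functional form are illustrative assumptions, not values or code from the paper.

```python
# Illustrative sketch (hypothetical data, NOT the paper's measurements):
# fit E(N) = a * N^b in log-log space and compare exponent signs.
import numpy as np

# Hypothetical parameter counts (millions) spanning roughly the model
# sizes discussed (111M to 13B), with invented entrainment scores.
params = np.array([111, 410, 1300, 6700, 13000], dtype=float)
semantic = np.array([0.40, 0.28, 0.20, 0.14, 0.10])      # shrinks with scale
non_semantic = np.array([0.10, 0.13, 0.16, 0.19, 0.21])  # grows with scale

def fit_power_law(n, e):
    """Least-squares fit of log E = log a + b * log N; returns (a, b)."""
    b, log_a = np.polyfit(np.log(n), np.log(e), 1)
    return np.exp(log_a), b

a_sem, b_sem = fit_power_law(params, semantic)
a_non, b_non = fit_power_law(params, non_semantic)

print(f"semantic exponent:     {b_sem:+.3f}")  # negative: resistance improves
print(f"non-semantic exponent: {b_non:+.3f}")  # positive: copying worsens
```

Under this toy setup, the two exponents come out with opposite signs, mirroring the paper's claim that semantic filtering and mechanical copying scale in opposition.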