Apr 21, 2026

arXiv:2604.19139

The Rise of Verbal Tics in Large Language Models: A Systematic Analysis Across Frontier Models

Shuai Wu, Yanna Feng, Yufang Li, Zhijun Wang, Ran Wang

AI Summary

This paper systematically analyzes the prevalence of verbal tics—repetitive linguistic patterns—in eight state-of-the-art LLMs using a novel Verbal Tic Index (VTI) across 10,000 prompts in English and Chinese. The study reveals significant inter-model variation in VTI scores, with Gemini 3.1 Pro exhibiting the highest and DeepSeek V3.2 the lowest, and demonstrates that tics accumulate in multi-turn conversations and are amplified in subjective tasks. Human evaluation confirms a strong inverse correlation between sycophancy and perceived naturalness, highlighting the "alignment tax" of current training paradigms.

Key Contribution

Frontier LLMs show pervasive verbal tics, such as sycophantic openers and pseudo-empathetic affirmations, and the paper frames this as an "alignment tax": higher sycophancy correlates strongly with lower human-perceived naturalness (r = -0.87).

Abstract

As Large Language Models (LLMs) continue to evolve through alignment techniques such as Reinforcement Learning from Human Feedback (RLHF) and Constitutional AI, a growing and increasingly conspicuous phenomenon has emerged: the proliferation of verbal tics -- repetitive, formulaic linguistic patterns that pervade model outputs. These range from sycophantic openers ("That's a great question!", "Awesome!") to pseudo-empathetic affirmations ("I completely understand your concern", "I'm right here to catch you") and overused vocabulary ("delve", "tapestry", "nuanced"). In this paper, we present a systematic analysis of the verbal tic phenomenon across eight state-of-the-art LLMs: GPT-5.4, Claude Opus 4.7, Gemini 3.1 Pro, Grok 4.2, Doubao-Seed-2.0-pro, Kimi K2.5, DeepSeek V3.2, and MiMo-V2-Pro. Utilizing a custom evaluation framework for standardized API-based evaluation, we assess 10,000 prompts across 10 task categories in both English and Chinese, yielding 160,000 model responses. We introduce the Verbal Tic Index (VTI), a composite metric quantifying tic prevalence, and analyze its correlation with sycophancy, lexical diversity, and human-perceived naturalness. Our findings reveal significant inter-model variation: Gemini 3.1 Pro exhibits the highest VTI (0.590), while DeepSeek V3.2 achieves the lowest (0.295). We further demonstrate that verbal tics accumulate over multi-turn conversations, are amplified in subjective tasks, and show distinct cross-lingual patterns. Human evaluation (N = 120) confirms a strong inverse relationship between sycophancy and perceived naturalness (r = -0.87, p < 0.001). These results underscore the "alignment tax" of current training paradigms and highlight the urgent need for more authentic human-AI interaction frameworks.

Topics: Constitutional AI & AI Ethics, Natural Language Processing, RLHF & Preference Learning

Citation Metrics

Citations: 0

Influential citations: 0

References: 16

Year: 2026

Venue: N/A
