USC · Mar 8, 2026 · arXiv:2603.07534

Accent Vector: Controllable Accent Manipulation for Multilingual TTS Without Accented Data

Thanathai Lertpetchpun, Thanapat Trachu, Jihwan Lee, Tiantian Feng, Dani Byrd, Shrikanth S. Narayanan

AI Summary

The paper introduces "Accent Vector," a method for controllable accent manipulation in multilingual TTS systems without requiring accented training data. This is achieved by fine-tuning a TTS model on native speech of a different language and deriving task vectors that capture accent characteristics in English. Scaling and interpolating the Accent Vector allows for fine-grained control over accent strength and the generation of mixed-accent speech, even generalizing to other languages.

Key Contribution

Control the accent of your TTS system in multiple languages without ever training on accented data.

Abstract

Accent is an integral part of society, reflecting multiculturalism and shaping how individuals express identity. The majority of English speakers are non-native (L2) speakers, yet current Text-To-Speech (TTS) systems primarily model American-accented English due to limited accented data. We propose Accent Vector, a controllable representation that enables accent manipulation in multilingual TTS without requiring accented training data. Accent Vector is derived by fine-tuning a TTS system on native speech of a different language (i.e., non-English) and computing task vectors that capture accent characteristics (i.e., in English). By scaling and interpolating the vector, we achieve fine-grained control over accent strength and generate mixed-accent speech. In addition, it generalizes beyond English, enabling accent control across multiple languages. Objective and human evaluations confirm the effectiveness of Accent Vector for fine-grained and compositional accent control.
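The scaling and interpolation described in the abstract can be illustrated with generic task-vector arithmetic. The sketch below is not taken from the paper: it assumes the common formulation in which a task vector is the element-wise difference between fine-tuned and pretrained weights, and it uses toy dictionaries of floats in place of real TTS checkpoints. The names `task_vector`, `apply_vectors`, `base`, `ft_a`, and `ft_b` are hypothetical.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flattened values.
# A real TTS checkpoint would hold tensors; plain lists keep the sketch self-contained.
Weights = Dict[str, List[float]]


def task_vector(pretrained: Weights, finetuned: Weights) -> Weights:
    """Element-wise difference (finetuned - pretrained) for every parameter."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def apply_vectors(pretrained: Weights, vectors: List[Weights], scales: List[float]) -> Weights:
    """Add a weighted sum of task vectors to the pretrained weights.

    One vector with a scale between 0 and 1 corresponds to adjusting strength;
    several vectors with separate scales correspond to mixing.
    """
    out = {name: list(values) for name, values in pretrained.items()}
    for vec, scale in zip(vectors, scales):
        for name, delta in vec.items():
            out[name] = [w + scale * d for w, d in zip(out[name], delta)]
    return out


# Hypothetical toy weights: one base model and two fine-tuned variants
# (assumed here to be fine-tuned on native speech of two different languages).
base = {"layer.weight": [1.0, 2.0, 3.0]}
ft_a = {"layer.weight": [1.5, 2.0, 2.0]}
ft_b = {"layer.weight": [1.0, 3.0, 3.5]}

vec_a = task_vector(base, ft_a)
vec_b = task_vector(base, ft_b)

# Half-strength version of vector A.
half_a = apply_vectors(base, [vec_a], [0.5])

# Equal-weight mix of vectors A and B.
mixed = apply_vectors(base, [vec_a, vec_b], [0.5, 0.5])

print(half_a["layer.weight"])  # [1.25, 2.0, 2.5]
print(mixed["layer.weight"])   # [1.25, 2.5, 2.75]
```

With a scale of 1.0 and a single vector, `apply_vectors` recovers the fine-tuned weights exactly; a scale of 0.0 returns the base model. Which layers are fine-tuned and which scale ranges work well are details the abstract does not specify.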

Natural Language Processing Speech & Audio

Citation Metrics

Citations: 0

Influential citations: 0

References: 45

Year: 2026

Venue: N/A
