This paper investigates translationese, the systematic divergence of translations from natively produced text, by framing it as a rational response to the cognitive load of the translation task. The authors operationalize translationese as a segment-level "translatedness" score from an automatic classifier, and quantify translation task difficulty mainly with information-theoretic metrics based on LLM surprisal, alongside established syntactic and semantic features. Experiments on a bidirectional English-German corpus with written and spoken subcorpora show that translationese can be partly predicted from translation task difficulty, especially in English-to-German. In most experiments, cross-lingual transfer difficulty contributes more than source-text complexity, and source-text syntactic complexity and translation-solution entropy are the strongest predictors.
Translationese, the tell-tale sign of translated text, can be partly predicted by quantifying the cognitive load of the translation task itself, which suggests that it is not just a stylistic preference but a rational adaptation to difficulty.
Translations systematically diverge from texts originally produced in the target language, a phenomenon widely referred to as translationese. Translationese has been attributed to production tendencies (e.g. interference, simplification), socio-cultural variables, and language-pair effects, yet a unified explanatory account is still lacking. We propose that translationese reflects cognitive load inherent in the translation task itself. We test whether observable translationese can be predicted from quantifiable measures of translation task difficulty. Translationese is operationalised as a segment-level translatedness score produced by an automatic classifier. Translation task difficulty is conceptualised as comprising source-text and cross-lingual transfer components, operationalised mainly through information-theoretic metrics based on LLM surprisal, complemented by established syntactic and semantic alternatives. We use a bidirectional English-German corpus comprising written and spoken subcorpora. Results indicate that translationese can be partly explained by translation task difficulty, especially in English-to-German. For most experiments, cross-lingual transfer difficulty contributes more than source-text complexity. Information-theoretic indicators match or outperform traditional features in written mode, but offer no advantage in spoken mode. Source-text syntactic complexity and translation-solution entropy emerged as the strongest predictors of translationese across language pairs and modes.
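To make the information-theoretic indicators more concrete, here is a minimal Python sketch of the two generic quantities the abstract names: surprisal and entropy over translation solutions. It is not the authors' implementation. The function names are hypothetical, the token probabilities are assumed to come from some language model (no model is called here), and estimating solution entropy from counts of observed alternative translations is an assumption about one way such a measure could be computed.

```python
import math
from collections import Counter


def surprisal_bits(prob):
    """Surprisal of an event with probability `prob`, in bits: -log2(p)."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def mean_surprisal(token_probs):
    """Average per-token surprisal of a segment.

    `token_probs` is assumed to hold the probability a language model
    assigned to each token given its context (hypothetical input).
    """
    if not token_probs:
        raise ValueError("segment must contain at least one token")
    return sum(surprisal_bits(p) for p in token_probs) / len(token_probs)


def solution_entropy(solutions):
    """Shannon entropy (bits) of the observed alternative translations.

    Assumption: the distribution over solutions is estimated from
    relative frequencies in a list of candidate translations.
    """
    if not solutions:
        raise ValueError("need at least one translation solution")
    counts = Counter(solutions)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Illustrative values only, not data from the paper.
print(mean_surprisal([0.5, 0.25]))                             # 1.5
print(solution_entropy(["Haus", "Haus", "Gebäude", "Heim"]))   # 1.5
```

Under these assumptions, a source item whose translations are spread over many alternatives receives higher entropy, which is one way to read "cross-lingual transfer difficulty": the translator has more competing solutions to choose from.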