Linköping | Mar 9, 2026 | arXiv:2603.08450

A Dataset for Probing Translationese Preferences in English-to-Swedish Translation

Jenny Kunz, Anja Jarochenko, Marcel Bollmann

AI Summary

This paper introduces a new English-to-Swedish dataset designed to probe translationese preferences in language models by contrasting translationese sentences with more idiomatic alternatives. Experiments using the dataset show that smaller Swedish and multilingual LLMs often prefer the translationese phrasing, even without the English source sentence as context. The results indicate that exposure to the source sentence biases models toward literal translations, highlighting a need for models that produce more natural output in non-English languages.

Key Contribution

Smaller Swedish and multilingual LLMs often prefer translationese phrasings over idiomatic alternatives, even when the English source sentence is omitted.

Abstract

Translations often carry traces of the source language, a phenomenon known as translationese. We introduce the first freely available English-to-Swedish dataset contrasting translationese sentences with idiomatic alternatives, designed to probe intrinsic preferences of language models. It includes error tags and descriptions of the problems in the original translations. In experiments evaluating smaller Swedish and multilingual LLMs with our dataset, we find that they often favor the translationese phrasing. Human alternatives are chosen more often when the English source sentence is omitted, indicating that exposure to the source biases models toward literal translations, although even without context models often prefer the translationese variant. Our dataset and findings provide a resource and benchmark for developing models that produce more natural, idiomatic output in non-English languages.
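The comparison described in the abstract can be pictured as a pairwise preference test: for each item, a model scores the translationese sentence and the idiomatic alternative, once with the English source as context and once without. The sketch below is a minimal illustration under stated assumptions, not the authors' evaluation code. The names `Pair`, `translationese_preference_rate`, and `toy_score`, the prompt template, and the example sentence pair are all hypothetical and do not come from the dataset. The scorer is a placeholder for whatever log-probability a real language model would assign.

```python
from typing import Callable, Iterable, NamedTuple


class Pair(NamedTuple):
    """One hypothetical item: English source plus two Swedish candidates."""
    source_en: str
    translationese_sv: str
    idiomatic_sv: str


# A scorer maps (context, continuation) to a log-probability-like number.
# In practice this would wrap a language model; here it is an assumption.
Scorer = Callable[[str, str], float]


def translationese_preference_rate(
    pairs: Iterable[Pair], score: Scorer, with_source: bool
) -> float:
    """Fraction of pairs where the translationese candidate scores higher."""
    items = list(pairs)
    if not items:
        raise ValueError("need at least one pair")
    wins = 0
    for p in items:
        # Hypothetical prompt template; the paper's exact format is not given here.
        context = f"English: {p.source_en}\nSwedish: " if with_source else ""
        if score(context, p.translationese_sv) > score(context, p.idiomatic_sv):
            wins += 1
    return wins / len(items)


def toy_score(context: str, continuation: str) -> float:
    """Placeholder scorer with no linguistic meaning: shorter text scores higher."""
    return -float(len(continuation))


# Invented example pair (not taken from the dataset).
examples = [Pair("It makes sense.", "Det gör mening.", "Det låter rimligt.")]

rate_with = translationese_preference_rate(examples, toy_score, with_source=True)
rate_without = translationese_preference_rate(examples, toy_score, with_source=False)
print(rate_with, rate_without)
```

With a real model behind `score`, comparing the two rates would show whether including the source shifts preferences toward the literal variant. Raw log-probabilities tend to favor shorter sentences, so some form of length normalization may be needed; which scoring the paper uses is not stated in this summary.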

Data Curation & Synthetic Data | Eval Frameworks & Benchmarks | Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 23

Year: 2026

Venue: N/A
