This paper investigates cross-lingual transfer learning for euphemism detection between Turkish and English, categorizing Potentially Euphemistic Terms (PETs) into Overlapping (OPETs) and Non-Overlapping (NOPETs) sets based on their functional, pragmatic, and semantic alignment. The study demonstrates that semantic overlap alone is insufficient for effective transfer, especially from Turkish to English, and that transfer performance can sometimes improve when training on NOPETs. The authors attribute these counterintuitive results in part to differences in label distributions between the languages.
Semantic similarity isn't enough for cross-lingual euphemism transfer: training on dissimilar euphemisms in another language can paradoxically *improve* performance.
Euphemisms substitute for socially sensitive expressions, often softening or reframing meaning, and their reliance on cultural and pragmatic context complicates modeling across languages. In this study, we investigate how cross-lingual equivalence influences transfer in multilingual euphemism detection. We categorize Potentially Euphemistic Terms (PETs) in Turkish and English into Overlapping (OPETs) and Non-Overlapping (NOPETs) subsets based on their functional, pragmatic, and semantic alignment. Our findings reveal a transfer asymmetry: semantic overlap is insufficient to guarantee positive transfer, particularly in the low-resource Turkish-to-English direction, where performance can degrade even for overlapping euphemisms and, in some cases, improve under NOPET-based training. Differences in label distribution help explain these counterintuitive results. Category-level analysis suggests that transfer may be influenced by domain-specific alignment, though evidence is limited by sparsity.