Mar 18, 2026 | arXiv:2603.17962

ConGA: Guidelines for Contextual Gender Annotation. A Framework for Annotating Gender in Machine Translation

Argentina Anna Rescigno, Eva Vanmassenhove, Johanna Monti

AI Summary

The paper introduces ConGA, a linguistically grounded annotation framework for word-level gender in English-to-Italian machine translation. The framework distinguishes semantic gender in English from grammatical gender realization in Italian. Applying ConGA to the gENder-IT dataset, the authors create a gold-standard resource for evaluating gender bias in translation systems. Evaluation with this resource reveals systematic masculine overuse and inconsistent feminine realization in current MT systems.

Key Contribution

Current machine translation systems show systematic masculine overuse and inconsistent feminine realization when translating from English, which largely omits grammatical gender, into Italian, which requires explicit agreement. The ConGA guidelines and the resulting gold-standard annotation of gENder-IT make this problem quantifiable.

Abstract

Handling gender across languages remains a persistent challenge for Machine Translation (MT) and Large Language Models (LLMs), especially when translating from gender-neutral languages into morphologically gendered ones, such as English to Italian. English largely omits grammatical gender, while Italian requires explicit agreement across multiple grammatical categories. This asymmetry often leads MT systems to default to masculine forms, reinforcing bias and reducing translation accuracy. To address this issue, we present the Contextual Gender Annotation (ConGA) framework, a linguistically grounded set of guidelines for word-level gender annotation. The scheme distinguishes between semantic gender in English through three tags, Masculine (M), Feminine (F), and Ambiguous (A), and grammatical gender realisation in Italian (Masculine (M), Feminine (F)), combined with entity-level identifiers for cross-sentence tracking. We apply ConGA to the gENder-IT dataset, creating a gold-standard resource for evaluating gender bias in translation. Our results reveal systematic masculine overuse and inconsistent feminine realisation, highlighting persistent limitations of current MT systems. By combining fine-grained linguistic annotation with quantitative evaluation, this work offers both a methodology and a benchmark for building more gender-aware and multilingual NLP systems.
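The tag scheme described in the abstract can be pictured as a small data structure. The following is a minimal sketch only, not the authors' annotation format or evaluation code: the class `GenderAnnotation`, the functions `check_tags` and `realisation_counts`, the integer entity identifiers, and the example sentence pair are all hypothetical and invented for illustration. It assumes that each annotated word carries one gender tag and one entity identifier, and that English and Italian annotations can be paired through that identifier.

```python
from collections import Counter
from dataclasses import dataclass

# Tag inventories as named in the abstract.
SOURCE_TAGS = {"M", "F", "A"}  # semantic gender in English (A = Ambiguous)
TARGET_TAGS = {"M", "F"}       # grammatical gender realisation in Italian


@dataclass(frozen=True)
class GenderAnnotation:
    """One annotated word (hypothetical layout, not the paper's file format)."""
    token: str
    tag: str
    entity_id: int  # assumed identifier linking mentions of the same entity


def check_tags(annotations, allowed):
    """Raise ValueError if any annotation uses a tag outside `allowed`."""
    for ann in annotations:
        if ann.tag not in allowed:
            raise ValueError(f"tag {ann.tag!r} not allowed for {ann.token!r}")


def realisation_counts(source, target):
    """Count (English tag, Italian tag) pairs, matched through entity_id."""
    check_tags(source, SOURCE_TAGS)
    check_tags(target, TARGET_TAGS)
    source_tag_by_entity = {ann.entity_id: ann.tag for ann in source}
    counts = Counter()
    for ann in target:
        source_tag = source_tag_by_entity.get(ann.entity_id)
        if source_tag is not None:
            counts[(source_tag, ann.tag)] += 1
    return counts


# Invented example: "The doctor said she was tired."
# The pronoun "she" makes the referent of "doctor" semantically feminine.
source = [GenderAnnotation("doctor", "F", 1)]

# Invented MT output that defaults to masculine forms:
# "Il dottore ha detto che era stanco."
target = [
    GenderAnnotation("Il", "M", 1),
    GenderAnnotation("dottore", "M", 1),
    GenderAnnotation("stanco", "M", 1),
]

counts = realisation_counts(source, target)
print(counts)
```

In this toy case every Italian word that agrees with the feminine referent is realised as masculine, so the only non-zero cell is `("F", "M")`. Aggregating such cells over a corpus is one plausible way to express "masculine overuse" as a number; the paper's own metrics may be defined differently.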

Constitutional AI & AI Ethics | Eval Frameworks & Benchmarks | Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 30

Year: 2026

Venue: N/A
