Bar-Ilan | Feb 26, 2026 | arXiv:2602.22865

Effective QA-driven Annotation of Predicate-Argument Relations Across Languages

Jonathan Davidov, Aviv Slobodkin, Shmuel Tomi Klein, Reut Tsarfaty, Ido Dagan, Ayal Klein

AI Summary

The paper introduces a cross-linguistic projection approach for generating predicate-argument annotations in new languages, built on the QA-SRL framework. The authors reuse an English QA-SRL parser within a constrained translation and word-alignment pipeline to automatically generate question-answer annotations aligned with target-language predicates. Applied to Hebrew, Russian, and French, the method yields training data for fine-tuned, language-specific parsers that outperform strong multilingual LLM baselines (GPT-4o, LLaMA-Maverick), demonstrating the effectiveness of QA-SRL for cross-lingual semantic annotation.

Key Contribution

Avoids the costly annotation bottleneck: a translation and projection method produces training data for QA-SRL parsers in new languages, and the resulting fine-tuned parsers outperform GPT-4o and LLaMA-Maverick.

Abstract

Explicit representations of predicate-argument relations form the basis of interpretable semantic analysis, supporting reasoning, generation, and evaluation. However, attaining such semantic structures requires costly annotation efforts and has remained largely confined to English. We leverage the Question-Answer driven Semantic Role Labeling (QA-SRL) framework -- a natural-language formulation of predicate-argument relations -- as the foundation for extending semantic annotation to new languages. To this end, we introduce a cross-linguistic projection approach that reuses an English QA-SRL parser within a constrained translation and word-alignment pipeline to automatically generate question-answer annotations aligned with target-language predicates. Applied to Hebrew, Russian, and French -- spanning diverse language families -- the method yields high-quality training data and fine-tuned, language-specific parsers that outperform strong multilingual LLM baselines (GPT-4o, LLaMA-Maverick). By leveraging QA-SRL as a transferable natural-language interface for semantics, our approach enables efficient and broadly accessible predicate-argument parsing across languages.
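To make the projection idea concrete, here is a minimal Python sketch of one generic step: mapping an English answer span onto target-language tokens through word alignments. It is an illustration under stated assumptions, not the paper's pipeline. The function name `project_span`, the toy sentence pair, and the hand-written alignment are all hypothetical; the abstract does not specify how translation is constrained or which aligner is used.

```python
from typing import List, Optional, Tuple

# Each pair is (source_token_index, target_token_index).
# Assumption: alignments come from some external word aligner.
Alignment = List[Tuple[int, int]]


def project_span(
    src_span: Tuple[int, int], alignment: Alignment
) -> Optional[Tuple[int, int]]:
    """Project an inclusive source token span onto the target sentence.

    Returns the smallest contiguous target span covering every target
    token aligned to a source token inside src_span, or None when no
    token in the span is aligned.
    """
    start, end = src_span
    targets = [t for s, t in alignment if start <= s <= end]
    if not targets:
        return None
    return (min(targets), max(targets))


# Toy English-French pair with a hand-written (hypothetical) alignment.
en = ["The", "committee", "approved", "the", "budget"]
fr = ["Le", "comité", "a", "approuvé", "le", "budget"]
alignment = [(0, 0), (1, 1), (2, 3), (3, 4), (4, 5)]

# Suppose an English QA-SRL parser answered "What was approved?"
# with the span "the budget" (source tokens 3..4).
answer_en = (3, 4)
answer_fr = project_span(answer_en, alignment)
projected_text = " ".join(fr[answer_fr[0] : answer_fr[1] + 1])

# The predicate "approved" (token 2) projects to "approuvé" (token 3).
predicate_fr = project_span((2, 2), alignment)
```

Taking the minimal covering span keeps projected answers contiguous, at the cost of possibly including unaligned target tokens that fall between aligned ones; a real pipeline would likely need filtering for such cases.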

Data Curation & Synthetic Data, Eval Frameworks & Benchmarks, Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 46

Year: 2026

Venue: N/A
