The paper introduces a cross-linguistic projection approach that generates predicate-argument annotations in new languages using the QA-SRL framework. It reuses an English QA-SRL parser within a constrained translation and word-alignment pipeline to automatically generate question-answer annotations aligned with target-language predicates. Applied to Hebrew, Russian, and French, the resulting fine-tuned, language-specific parsers outperform strong multilingual LLM baselines such as GPT-4o and LLaMA-Maverick, demonstrating the effectiveness of QA-SRL for cross-lingual semantic annotation.
Skip the costly annotation bottleneck: a clever translation and projection method lets you train high-quality semantic role labelers in new languages, even beating GPT-4o.
Explicit representations of predicate-argument relations form the basis of interpretable semantic analysis, supporting reasoning, generation, and evaluation. However, attaining such semantic structures requires costly annotation efforts and has remained largely confined to English. We leverage the Question-Answer driven Semantic Role Labeling (QA-SRL) framework -- a natural-language formulation of predicate-argument relations -- as the foundation for extending semantic annotation to new languages. To this end, we introduce a cross-linguistic projection approach that reuses an English QA-SRL parser within a constrained translation and word-alignment pipeline to automatically generate question-answer annotations aligned with target-language predicates. Applied to Hebrew, Russian, and French -- spanning diverse language families -- the method yields high-quality training data and fine-tuned, language-specific parsers that outperform strong multilingual LLM baselines (GPT-4o, LLaMA-Maverick). By leveraging QA-SRL as a transferable natural-language interface for semantics, our approach enables efficient and broadly accessible predicate-argument parsing across languages.
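The abstract does not spell out how English annotations are mapped onto target-language tokens. The sketch below illustrates one common span-projection heuristic from the annotation-projection literature, which is not necessarily the procedure used in the paper: an English token span is mapped to the smallest contiguous target span covering every aligned token. The function name `project_span`, the alignment format, and the toy English-French sentence pair are all assumptions made for illustration.

```python
from typing import Optional, Set, Tuple

# Hypothetical alignment format: pairs of (english_token_index, target_token_index).
Alignment = Set[Tuple[int, int]]
Span = Tuple[int, int]  # inclusive (start, end) token indices


def project_span(span: Span, alignment: Alignment) -> Optional[Span]:
    """Project an English token span onto the target sentence.

    Returns the smallest contiguous target span that covers every target
    token aligned to a source token inside `span`, or None when no token
    in the span is aligned.
    """
    start, end = span
    targets = [t for s, t in alignment if start <= s <= end]
    if not targets:
        return None  # nothing to project; a pipeline might discard this annotation
    return min(targets), max(targets)


# Toy example (illustrative data, not taken from the paper).
english = ["The", "cat", "ate", "the", "fish"]
french = ["Le", "chat", "a", "mangé", "le", "poisson"]
alignment: Alignment = {(0, 0), (1, 1), (2, 2), (2, 3), (3, 4), (4, 5)}

predicate = project_span((2, 2), alignment)  # "ate"
answer = project_span((3, 4), alignment)     # "the fish", answering "What did someone eat?"

if predicate is not None and answer is not None:
    print(" ".join(french[predicate[0]: predicate[1] + 1]))
    print(" ".join(french[answer[0]: answer[1] + 1]))
```

Returning `None` for unaligned spans makes it easy to filter out annotations that cannot be projected; whether and how the paper's constrained pipeline filters such cases is not stated in the abstract.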