The Universal NER v2 project expands upon the original UNER v1 dataset to create a massively multilingual Named Entity Recognition benchmark, addressing the scarcity of gold-standard evaluation resources for non-English languages. This benchmark employs a general tagset and annotation guidelines to ensure standardized, cross-lingual annotations of named entity spans. The project aims to facilitate more robust evaluation of multilingual language models and promote research in low-resource languages.
Massively multilingual NER just got easier: UNER v2 offers a standardized benchmark for evaluating LLMs across diverse languages.
While multilingual language models promise to bring the benefits of LLMs to speakers of many languages, gold-standard evaluation benchmarks with which to test that promise remain scarce in most languages. The Universal NER project, now in its fourth year, is dedicated to building gold-standard multilingual Named Entity Recognition (NER) benchmark datasets. Inspired by existing massively multilingual efforts for other core NLP tasks (e.g., Universal Dependencies), the project uses a general tagset and thorough annotation guidelines to collect standardized, cross-lingual annotations of named entity spans. The first installment (UNER v1) was released in 2024, and the project has since continued to expand, with an active community of organizers, annotators, and collaborators.
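To make the idea of standardized named entity span annotations concrete, here is a minimal sketch of turning IOB2-style token tags into labeled spans, assuming a UNER-like coarse tagset (PER, ORG, LOC); the example sentence, function name, and exact file/tag conventions are illustrative assumptions, not taken from the project's release.

```python
def iob2_to_spans(tokens, tags):
    """Collect (label, start, end) entity spans from IOB2 tags (end exclusive).

    This is a simplified sketch: a stray I- tag that does not continue the
    current entity simply closes the open span, which is stricter than some
    real-world decoders.
    """
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            # B- always opens a new span, closing any open one first.
            if label is not None:
                spans.append((label, start, i))
            start, label = i, tag[2:]
        elif tag.startswith("I-") and label == tag[2:]:
            # I- with a matching label extends the current span.
            continue
        else:
            # O (or a mismatched I-) closes any open span.
            if label is not None:
                spans.append((label, start, i))
            start, label = None, None
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans

# Hypothetical example sentence with UNER-style coarse labels.
tokens = ["Ada", "Lovelace", "visited", "London", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
print(iob2_to_spans(tokens, tags))  # [('PER', 0, 2), ('LOC', 3, 4)]
```

Working at the span level, rather than the raw tag level, is what allows the same evaluation code to be reused unchanged across all of the benchmark's languages.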