This paper introduces a multilingual machine learning system that helps Wikipedia editors identify claims requiring citations across 10 language editions. The system addresses the labor-intensive manual verification work performed by editors, which is crucial for maintaining the reliability of Wikipedia content. The deployed system outperforms existing benchmarks for reference need assessment while accounting for real-world infrastructure constraints and computational efficiency.
Wikipedia editors can now get AI assistance to identify claims needing citations in 10 languages, improving content reliability at scale.
Wikipedia is a critical source of information for millions of users across the Web. It serves as a key resource for large language models, search engines, question-answering systems, and other Web-based applications. In Wikipedia, content needs to be verifiable, meaning that readers can check that claims are backed by references to reliable sources. This depends on manual verification by editors, an effective but labor-intensive process, especially given the high volume of daily edits. To address this challenge, we introduce a multilingual machine learning system to assist editors in identifying claims requiring citations. Our approach is tested in 10 language editions of Wikipedia, outperforming existing benchmarks for reference need assessment. We not only consider machine learning evaluation metrics but also system requirements, allowing us to explore the trade-offs between model accuracy and computational efficiency under real-world infrastructure constraints. We deploy our system in production and release data and code to support further research.
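The core task the abstract describes, deciding whether a given sentence needs a reference, is a binary sentence classification problem. As a rough illustration only (not the paper's actual model or data), the sketch below trains a toy classifier on a handful of hypothetical labeled sentences, using character n-gram TF-IDF features, which are language-agnostic, with logistic regression; the deployed system is a multilingual neural model evaluated across 10 language editions.

```python
# Toy sketch of a "citation needed" sentence classifier.
# Assumptions (not from the paper): the tiny labeled corpus below,
# TF-IDF character n-gram features, and logistic regression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: 1 = claim needs a citation, 0 = it does not.
train_sentences = [
    "The city has a population of 2.3 million people.",
    "Studies have shown the drug reduces symptoms by 40%.",
    "The singer won three national awards in the 1990s.",
    "The article is divided into three sections.",
    "See also the list of related topics below.",
    "This section describes the album's track listing.",
]
train_labels = [1, 1, 1, 0, 0, 0]

# Character n-grams ("char_wb") avoid language-specific tokenization,
# a crude stand-in for the multilingual modeling in the paper.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(),
)
model.fit(train_sentences, train_labels)

def needs_citation(sentence: str) -> bool:
    """Return True if the sentence is predicted to need a reference."""
    return bool(model.predict([sentence])[0])
```

In a production setting like the one described, such a model would score sentences at edit time and surface high-confidence flags to editors, which is where the accuracy-versus-efficiency trade-off the abstract mentions becomes central.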