The paper introduces AfrIFact, a new dataset for automatic fact-checking in ten African languages and English, covering the full pipeline of information retrieval, evidence extraction, and fact-checking. Experiments reveal limitations in the cross-lingual retrieval capabilities of embedding models and highlight the difficulty of retrieving healthcare-domain documents compared to cultural or news documents. The study demonstrates that while LLMs struggle with multilingual fact verification in African languages, few-shot prompting and task-specific fine-tuning significantly improve performance, particularly with the AfriqueQwen-14B model.
LLMs' fact-checking abilities in African languages are surprisingly weak, but can be boosted by up to 43% with few-shot prompting and by a further 26% with task-specific fine-tuning.
Assessing the veracity of a claim made online is a complex and important task with real-world implications. When such claims are directed at communities with limited access to information and the content concerns issues such as healthcare and culture, the consequences intensify, especially in low-resource languages. In this work, we introduce AfrIFact, a dataset that covers the necessary steps for automatic fact-checking (i.e., information retrieval, evidence extraction, and fact-checking) in ten African languages and English. Our evaluation results show that even the best embedding models lack cross-lingual retrieval capabilities, and that cultural and news documents are easier to retrieve than healthcare-domain documents, both in large corpora and in single documents. We show that LLMs lack robust multilingual fact-verification capabilities in African languages; few-shot prompting improves performance by up to 43% for AfriqueQwen-14B, and task-specific fine-tuning further improves fact-checking accuracy by up to 26%. These findings, along with our release of the AfrIFact dataset, encourage work on low-resource information retrieval, evidence extraction, and fact-checking.
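As a rough illustration of the few-shot setup the abstract describes, the minimal Python sketch below assembles a few-shot claim-verification prompt. The label set (SUPPORTED / REFUTED / NOT ENOUGH INFO), the example claims, and the `build_fact_check_prompt` helper are illustrative assumptions for this sketch, not the prompts or labels actually used in AfrIFact.

```python
# Illustrative sketch only: the label names, example claims, and prompt wording
# below are assumptions for demonstration, not the AfrIFact prompts or labels.

FEW_SHOT_EXAMPLES = [
    {
        "claim": "The Nile is the longest river in Africa.",
        "evidence": "The Nile flows about 6,650 km, longer than any other African river.",
        "label": "SUPPORTED",
    },
    {
        "claim": "Lagos is the capital of Kenya.",
        "evidence": "Nairobi is the capital of Kenya; Lagos is a city in Nigeria.",
        "label": "REFUTED",
    },
]


def build_fact_check_prompt(claim: str, evidence: str) -> str:
    """Assemble a few-shot prompt asking a model to classify a claim against evidence."""
    lines = [
        "Decide whether the evidence SUPPORTS or REFUTES the claim, "
        "or if there is NOT ENOUGH INFO.\n"
    ]
    for ex in FEW_SHOT_EXAMPLES:
        lines.append(f"Claim: {ex['claim']}")
        lines.append(f"Evidence: {ex['evidence']}")
        lines.append(f"Label: {ex['label']}\n")
    lines.append(f"Claim: {claim}")
    lines.append(f"Evidence: {evidence}")
    lines.append("Label:")
    return "\n".join(lines)


if __name__ == "__main__":
    # The resulting prompt string would then be sent to whichever LLM is being evaluated.
    print(build_fact_check_prompt(
        claim="Malaria can be transmitted through mosquito bites.",
        evidence="Malaria parasites are spread to people through the bites of "
                 "infected female Anopheles mosquitoes.",
    ))
```

In practice, the in-context examples would be drawn from the dataset itself and the completed prompt passed to the model under evaluation; this sketch only shows the general shape of such a setup.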