University of the Basque Country UPV/EHU · Apr 22, 2026 · arXiv:2604.20531

Effects of Cross-lingual Evidence in Multilingual Medical Question Answering

Anar Yeginbergen, Maite Oronoz, Rodrigo Agerri

AI Summary

This paper benchmarks multilingual medical question answering (QA) across high- and low-resource languages, evaluating curated medical knowledge, web-retrieved content, and LLM-generated explanations as external evidence sources. The authors find that larger models perform better on English baselines, that web-retrieved English content helps most for high-resource languages, and that combining English and target-language retrieval works best for low-resource languages. The study also shows that specialized medical knowledge sources such as PubMed are limited by insufficient multilingual coverage, which challenges the assumption that external knowledge always improves QA.

Key Contribution

The best external knowledge source for multilingual medical QA depends on whether the target language is high- or low-resource, and adding PubMed evidence by default may not help.

Abstract

This paper investigates Multilingual Medical Question Answering across high-resource (English, Spanish, French, Italian) and low-resource (Basque, Kazakh) languages. We evaluate three types of external evidence sources across models of varying size: curated repositories of specialized medical knowledge, web-retrieved content, and explanations drawn from the parametric knowledge of LLMs. Moreover, we conduct experiments with multilingual, monolingual and cross-lingual retrieval. Our results demonstrate that larger models consistently achieve superior performance in English across baseline evaluations. When incorporating external knowledge, web-retrieved data in English proves most beneficial for high-resource languages. Conversely, for low-resource languages, the most effective strategy combines retrieval in both English and the target language, achieving accuracy comparable to that of high-resource languages. These findings challenge the assumption that external knowledge systematically improves performance and reveal that effective strategies depend on both the source of language resources and on model scale. Furthermore, specialized medical knowledge sources such as PubMed are limited: while they provide authoritative expert knowledge, they lack adequate multilingual coverage.
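The retrieval findings reported in the abstract can be read as a simple language-dependent policy. The sketch below is only an illustration of that reading, not the authors' implementation: the function name `retrieval_languages`, the ISO 639-1 language codes, and the grouping constants are all assumptions introduced here.

```python
# Illustrative sketch only: names and language codes are hypothetical,
# chosen to mirror the language groups listed in the abstract.

# High-resource: English, Spanish, French, Italian.
HIGH_RESOURCE = frozenset({"en", "es", "fr", "it"})
# Low-resource: Basque, Kazakh.
LOW_RESOURCE = frozenset({"eu", "kk"})


def retrieval_languages(question_language: str) -> tuple:
    """Return the languages in which to retrieve evidence for a question.

    Mirrors the abstract's reported outcome:
    - high-resource languages: English retrieval (web-retrieved content);
    - low-resource languages: English plus the target language.
    """
    if question_language in HIGH_RESOURCE:
        return ("en",)
    if question_language in LOW_RESOURCE:
        return ("en", question_language)
    # The abstract covers only six languages; anything else is out of scope.
    raise ValueError(f"language not covered by the study: {question_language!r}")


if __name__ == "__main__":
    for lang in ("es", "eu", "kk"):
        print(lang, "->", retrieval_languages(lang))
```

The abstract also notes that the best strategy depends on model scale, which this sketch deliberately leaves out because no scale thresholds are given.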

Eval Frameworks & Benchmarks · Natural Language Processing · Recommendation & Information Retrieval

Citation Metrics

Citations: 0
Influential citations: 0
References: 0
Year: 2026
Venue: N/A
