Feb 24, 2026

arXiv:2602.21379

MrBERT: Modern Multilingual Encoders via Vocabulary, Domain, and Dimensional Adaptation

Daniel Tamayo, Iñaki Lacunza, Paula Rivera-Hidalgo, Severino Da Dalt, Javier Aula-Blasco, Aitor Gonzalez-Agirre, Marta Villegas

AI Summary

The paper introduces MrBERT, a family of ModernBERT-based multilingual encoders pre-trained on 35 languages and code. MrBERT achieves state-of-the-art results on Catalan and Spanish tasks and demonstrates strong performance in biomedical and legal domains through targeted vocabulary, domain, and dimensional adaptation. The incorporation of Matryoshka Representation Learning (MRL) allows for flexible vector sizing, reducing inference and storage costs.

Key Contribution

Domain- and language-specific adaptation of ModernBERT, combined with Matryoshka Representation Learning, delivers state-of-the-art results on Catalan and Spanish tasks together with lower inference and storage costs, indicating that compact, targeted encoders can serve both language-specific and domain-specific needs.

Abstract

We introduce MrBERT, a family of 150M-300M parameter encoders built on the ModernBERT architecture and pre-trained on 35 languages and code. Through targeted adaptation, this model family achieves state-of-the-art results on Catalan- and Spanish-specific tasks, while establishing robust performance across specialized biomedical and legal domains. To bridge the gap between research and production, we incorporate Matryoshka Representation Learning (MRL), enabling flexible vector sizing that significantly reduces inference and storage costs. Ultimately, the MrBERT family demonstrates that modern encoder architectures can be optimized for both localized linguistic excellence and efficient, high-stakes domain specialization. We open-source the complete model family on Hugging Face.
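To make the "flexible vector sizing" idea concrete, here is a minimal sketch of how an MRL-trained embedding is typically used at inference time: keep only the first few components, re-normalize, and compare. This is a general illustration of Matryoshka-style truncation, not the authors' code. The function names, the toy 8-dimensional vectors, and the 4-byte float assumption are all made up for the example.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-normalize the result.

    Hypothetical helper: assumes the encoder was trained so that leading
    components carry the most information (the Matryoshka property).
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)
    return [x / norm for x in head]


def cosine(a, b):
    """Dot product of two unit-length vectors, i.e. their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def storage_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw index size, assuming 4-byte floats (an assumption, not from the paper)."""
    return num_vectors * dim * bytes_per_value


# Toy vectors standing in for encoder outputs; the values are invented.
doc = [0.9, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01, 0.01]
query = [0.8, 0.4, 0.1, 0.2, 0.01, 0.03, 0.02, 0.01]

for dim in (8, 4, 2):
    score = cosine(truncate_embedding(query, dim), truncate_embedding(doc, dim))
    print(dim, round(score, 3), storage_bytes(1_000_000, dim))
```

Storage scales linearly with the retained dimension, so halving the vector size halves the raw index size. How much retrieval quality survives truncation depends on the trained model and has to be measured per task.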

Architecture Design (Transformers, SSMs, MoE) | Natural Language Processing | Open-Source Models & Weights

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
