Baidu, CAS | Jun 17, 2026 | arXiv:2606.19037

Querit-Reranker: Training Compact Multilingual Rerankers via Efficient Label-Free Distribution Adaptation

Yunfei Zhong, Jun Yang, Wei Huang, Yinqiong Cai, Haosheng Qian, Yixing Fan, Ruqing Zhang, Lixin Su, Daiting Shi, Jiafeng Guo

AI Summary

The paper introduces Querit-Reranker, a family of multilingual cross-encoder rerankers designed to adapt to new target distributions without extensive task-specific relevance annotations. Its data-centric pipeline first learns general relevance modeling from large-scale ranking-oriented data, then adapts to target distributions through synthetic-query mining with teacher scores as continuous soft labels. With Qwen3-Embedding-0.6B as the shared first-stage retriever, Querit-Reranker-A0.4B raises average nDCG@10 from 54.11 to 59.28 on BEIR and from 59.87 to 67.70 on MIRACL, while Querit-Reranker-4B achieves state-of-the-art performance among publicly available models on MTEB Multilingual v2 Reranking.
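To make "teacher scores as continuous soft labels" concrete, here is a minimal sketch of one way a student reranker could be trained against a teacher score in [0, 1] instead of a hard 0/1 relevance label. The paper's actual loss is not stated on this page; the pointwise binary cross-entropy below, and the names `soft_label_bce`, `student_logit`, and `teacher_score`, are assumptions chosen for illustration.

```python
import math


def soft_label_bce(student_logit: float, teacher_score: float) -> float:
    """Binary cross-entropy between a student logit and a soft target.

    Assumption: teacher_score is a relevance probability in [0, 1].
    Uses the numerically stable form
        max(s, 0) - s * y + log(1 + exp(-|s|))
    which equals -(y*log(sigmoid(s)) + (1-y)*log(1-sigmoid(s))).
    """
    s, y = student_logit, teacher_score
    return max(s, 0.0) - s * y + math.log1p(math.exp(-abs(s)))


def mean_soft_label_loss(student_logits, teacher_scores):
    """Average the pointwise loss over a batch of (query, passage) pairs."""
    losses = [soft_label_bce(s, y) for s, y in zip(student_logits, teacher_scores)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical student logits and teacher scores for three pairs.
    logits = [2.0, 0.0, -1.5]
    targets = [0.9, 0.4, 0.1]
    print(mean_soft_label_loss(logits, targets))
```

With a soft target, the loss is minimized when the student's sigmoid output equals the teacher score rather than when it saturates at 0 or 1, which is what lets graded teacher judgments carry more information than binary labels.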

Key Contribution

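Compact multilingual rerankers that reach state-of-the-art performance among publicly available models by adapting to target distributions with teacher-scored synthetic queries rather than extensive task-specific annotations, then merging the adapted checkpoints into a single deployable model.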

Abstract

Deployable multilingual rerankers must generalize across languages, domains, and target ranking tasks while remaining efficient enough for second-stage reranking. However, adapting them to new target distributions typically requires extensive task-specific relevance annotations, which are costly to obtain. We present Querit-Reranker, a family of multilingual cross-encoder rerankers trained with a data-centric pipeline for label-efficient adaptation. We instantiate it as Querit-Reranker-A0.4B, initialized from an in-house MoE backbone with 0.4B activated parameters, and Querit-Reranker-4B, initialized from Qwen3-Embedding-4B. Our pipeline first learns general relevance modeling from large-scale ranking-oriented data, then adapts to target distributions through synthetic-query mining with teacher scores as continuous soft labels. To consolidate complementary task-adapted strengths, we further merge checkpoints via spherical linear interpolation, obtaining a single deployable model without runtime ensembling overhead. Using Qwen3-Embedding-0.6B as the shared first-stage retriever, Querit-Reranker-A0.4B improves average nDCG@10 from 54.11 to 59.28 on BEIR and from 59.87 to 67.70 on MIRACL. On MTEB Multilingual v2 Reranking, it also substantially outperforms larger embedding-based baselines, while Querit-Reranker-4B further achieves state-of-the-art performance among publicly available models. We release both models on Hugging Face.
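The checkpoint-merging step in the abstract relies on spherical linear interpolation (SLERP). The sketch below shows the standard SLERP formula applied to two flattened weight vectors; how the authors apply it (per tensor or over the whole model, and with what interpolation weight) is not stated here, so the function name `slerp`, the parameter `t`, and the fallback to linear interpolation for near-colinear vectors are illustrative assumptions.

```python
import math


def slerp(w0, w1, t, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors.

    t=0 returns w0, t=1 returns w1. Falls back to plain linear
    interpolation when either vector is zero or the vectors are
    (anti)parallel, where the SLERP formula is ill-conditioned.
    """
    norm0 = math.sqrt(sum(x * x for x in w0))
    norm1 = math.sqrt(sum(x * x for x in w1))

    def lerp():
        return [(1 - t) * a + t * b for a, b in zip(w0, w1)]

    if norm0 < eps or norm1 < eps:
        return lerp()

    # Cosine of the angle between the vectors, clamped for acos.
    cos_omega = sum(a * b for a, b in zip(w0, w1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        return lerp()

    s0 = math.sin((1 - t) * omega) / sin_omega
    s1 = math.sin(t * omega) / sin_omega
    return [s0 * a + s1 * b for a, b in zip(w0, w1)]


if __name__ == "__main__":
    # Two hypothetical task-adapted checkpoints, flattened to vectors.
    ckpt_a = [1.0, 0.0]
    ckpt_b = [0.0, 1.0]
    print(slerp(ckpt_a, ckpt_b, 0.5))
```

Because the result is a single set of weights, inference cost matches that of one model, which is the property the abstract contrasts with runtime ensembling.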

Natural Language Processing; Recommendation & Information Retrieval

Year: 2026

Venue: N/A
