Mila, BJTU | Apr 28, 2026 | arXiv:2604.25182

CroSearch-R1: Better Leveraging Cross-lingual Knowledge for Retrieval-Augmented Generation

Ruizhen Qi, Fengran Mo, Sijin Lu, Yufeng Chen, Jian-Yun Nie, Kaiyu Huang

AI Summary

This paper introduces CroSearch-R1, a search-augmented reinforcement learning framework that improves Retrieval-Augmented Generation (RAG) by integrating multilingual knowledge. Using a multi-turn retrieval strategy and a multilingual rollout mechanism, the framework aligns cross-lingual knowledge into a unified representation space, addressing the limitations of simply concatenating passages from different languages. Experimental results show that CroSearch-R1 improves RAG effectiveness on multilingual collections by leveraging the complementarity of multilingual data.

Key Contribution

CroSearch-R1 shows that integrating cross-lingual knowledge through a dynamic, multi-turn retrieval strategy can improve the effectiveness of Retrieval-Augmented Generation systems.

Abstract

A multilingual collection may contain useful knowledge in other languages to supplement and correct the facts in the original language for Retrieval-Augmented Generation (RAG). However, the vanilla approach that simply concatenates multiple pieces of knowledge from different languages into the context may fail to improve effectiveness due to the potential disparities across languages. To better leverage multilingual knowledge, we propose CroSearch-R1, a search-augmented reinforcement learning framework to integrate multilingual knowledge into the Group Relative Policy Optimization (GRPO) process. In particular, the approach adopts a multi-turn retrieval strategy with cross-lingual knowledge integration to dynamically align the knowledge from other languages as supplementary evidence into a unified representation space. Furthermore, we introduce a multilingual rollout mechanism to optimize reasoning transferability across languages. Experimental results demonstrate that our framework effectively leverages cross-lingual complementarity and improves the effectiveness of RAG with multilingual collections.
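As a rough illustration of the GRPO step the abstract refers to, the sketch below computes group-relative advantages: each rollout's reward is normalized against the mean and standard deviation of its own group. The function name, the reward values, the language labels, and the choice of population standard deviation are all assumptions made for illustration; the abstract does not specify the reward design or how multilingual rollouts are grouped.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group (GRPO-style advantage).

    Assumption: population standard deviation, with eps guarding against
    division by zero when every rollout in the group has the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four rollouts of the same question, one per
# language. These values are invented for illustration only.
rollout_rewards = {"en": 1.0, "zh": 0.0, "fr": 1.0, "de": 0.0}

advantages = dict(
    zip(rollout_rewards, group_relative_advantages(list(rollout_rewards.values())))
)

for lang, adv in advantages.items():
    print(f"{lang}: {adv:+.3f}")
```

Rollouts that score above the group mean receive a positive advantage and those below receive a negative one, so no separate value model is needed to provide a baseline.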

Natural Language Processing; Recommendation & Information Retrieval; RLHF & Preference Learning

Citation Metrics

Citations: 0

Influential citations: 0

References: 44

Year: 2026

Venue: N/A
