This paper introduces CroSearch-R1, a search-augmented reinforcement learning framework that improves Retrieval-Augmented Generation (RAG) by integrating multilingual knowledge. Using a multi-turn retrieval strategy and a multilingual rollout mechanism, the framework aligns cross-lingual knowledge in a unified representation space rather than simply concatenating passages from different languages into the context. Experimental results show that CroSearch-R1 improves RAG effectiveness on multilingual collections by exploiting the complementarity of knowledge across languages.
CroSearch-R1 shows that integrating cross-lingual knowledge through a dynamic, multi-turn retrieval strategy can improve the effectiveness of Retrieval-Augmented Generation systems.
A multilingual collection may contain useful knowledge in other languages that can supplement and correct the facts available in the original language for Retrieval-Augmented Generation (RAG). However, the vanilla approach of simply concatenating multiple pieces of knowledge from different languages into the context may fail to improve effectiveness because of disparities across languages. To better leverage multilingual knowledge, we propose CroSearch-R1, a search-augmented reinforcement learning framework that integrates multilingual knowledge into the Group Relative Policy Optimization (GRPO) process. In particular, the approach adopts a multi-turn retrieval strategy with cross-lingual knowledge integration, which dynamically aligns knowledge from other languages, used as supplementary evidence, in a unified representation space. Furthermore, we introduce a multilingual rollout mechanism to optimize reasoning transferability across languages. Experimental results demonstrate that our framework effectively leverages cross-lingual complementarity and improves the effectiveness of RAG with multilingual collections.
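To make the GRPO step more concrete, the following minimal sketch shows the group-relative advantage computation that GRPO is generally built on: each rollout's reward is normalized against the mean and standard deviation of its own group. It is an illustration only, not the authors' implementation. The grouping of rollouts by language, the `lang` and `reward` fields, the binary reward values, the use of the population standard deviation, and the `eps` stabilizer are all assumptions made for this example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own group (GRPO-style advantage).

    `eps` is a hypothetical stabilizer that avoids division by zero when
    every rollout in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group: one question, with rollouts carried out in several
# languages. The rewards are invented for illustration (1.0 = correct answer).
rollouts = [
    {"lang": "en", "reward": 1.0},
    {"lang": "zh", "reward": 0.0},
    {"lang": "de", "reward": 1.0},
    {"lang": "fr", "reward": 0.0},
]

advantages = group_relative_advantages([r["reward"] for r in rollouts])

for rollout, adv in zip(rollouts, advantages):
    print(f"{rollout['lang']}: advantage = {adv:+.3f}")
```

Under these assumptions, rollouts that score above the group mean receive a positive advantage and those below it receive a negative one, regardless of the language in which the rollout was produced. If all rollouts in a group get the same reward, every advantage is zero and the group contributes no learning signal.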