This study examines source language selection in cross-lingual in-context learning (ICL) across seven tasks and six models, and finds that insights from supervised fine-tuning do not consistently hold in this setting. The authors evaluate how the choice of source language affects ICL performance and separately analyze language confusion, a key obstacle for generative tasks. The findings point to alternative heuristics for selecting source languages in cross-lingual ICL, challenging existing assumptions in the field.
Conventional wisdom about language transfer, drawn from fine-tuning, does not consistently hold for few-shot in-context learning, so source language selection needs new heuristics.
Cross-lingual transfer in multilingual NLP has been widely explored in supervised fine-tuning contexts, where factors like data availability and linguistic similarity largely determine transfer quality. As the field shifts toward few-shot In-Context Learning (ICL), it is often presumed that insights from fine-tuning carry over unchanged. Yet this assumption has not been rigorously evaluated, leaving open the question of how to choose source languages for cross-lingual ICL. We conduct a broad empirical study of cross-lingual transfer in ICL spanning seven tasks, six models, and a typologically diverse set of languages. We further analyze language confusion, a key obstacle for generative tasks in cross-lingual ICL. Our results show that conventional fine-tuning-based expectations do not consistently apply in the ICL regime and point to alternative heuristics for selecting source languages effectively.