This paper investigates whether function vectors (FVs) extracted from multilingual LLMs during in-context machine translation are language-agnostic. Across three decoder-only multilingual LLMs, the authors find that a translation FV derived from a single English-to-target direction transfers to other target languages, consistently improving the rank of correct translation tokens in multiple unseen languages. Ablation studies show that removing the FV degrades translation across languages while having limited impact on unrelated tasks. Base-model FVs also transfer to instruction-tuned variants and partially generalize from word-level to sentence-level translation.
A function vector extracted from a single English-to-X translation direction transfers to other, unseen target languages, suggesting a shared underlying translation mechanism in multilingual LLMs.
Function vectors (FVs) are vector representations of tasks extracted from model activations during in-context learning. While prior work has shown that multilingual model representations can be language-agnostic, it remains unclear whether the same holds for function vectors. We study whether FVs exhibit language-agnosticity, using machine translation as a case study. Across three decoder-only multilingual LLMs, we find that translation FVs extracted from a single English$\rightarrow$Target direction transfer to other target languages, consistently improving the rank of correct translation tokens across multiple unseen languages. Ablation results show that removing the FV degrades translation across languages with limited impact on unrelated tasks. We further show that base-model FVs transfer to instruction-tuned variants and partially generalize from word-level to sentence-level translation.
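To make the evaluation idea concrete, the following is a minimal toy sketch of adding a task vector to a hidden state and checking how the rank of a target token changes. It is not the paper's extraction procedure: the mean-difference construction, the identity unembedding matrix `W`, and the helper names `token_rank`, `mean_difference_vector`, and `add_vector` are all assumptions chosen for illustration.

```python
import numpy as np


def token_rank(logits, target_id):
    """Return the 1-based rank of target_id (1 = highest logit)."""
    return int(np.sum(logits > logits[target_id])) + 1


def mean_difference_vector(task_acts, baseline_acts):
    """Hypothetical stand-in for a function vector: the difference between
    mean activations on task prompts and on baseline prompts."""
    return task_acts.mean(axis=0) - baseline_acts.mean(axis=0)


def add_vector(hidden, vec, scale=1.0):
    """Steer a hidden state by adding a scaled vector to it."""
    return hidden + scale * vec


# Toy setup (assumed): 3-dimensional hidden states, 3-token vocabulary,
# and an identity unembedding so logits equal the hidden state.
W = np.eye(3)
hidden = np.array([1.0, 0.5, 0.0])
target_id = 2  # index of the "correct translation" token in this toy

# Made-up activations collected with and without in-context examples.
task_acts = np.array([[0.0, 0.0, 2.0], [0.0, 0.0, 4.0]])
baseline_acts = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]])

fv = mean_difference_vector(task_acts, baseline_acts)

rank_before = token_rank(W @ hidden, target_id)
rank_after = token_rank(W @ add_vector(hidden, fv), target_id)
print(rank_before, rank_after)
```

In this toy, a lower rank after adding the vector corresponds to the kind of improvement the abstract reports for correct translation tokens. In a real model, the vector would be added to a layer's hidden state during a forward pass.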