The University of Hong Kong
Explicitly aligning mixture-of-experts (MoE) routing behavior during fine-tuning can significantly boost performance on multilingual tasks, especially when the model understands the task in English but struggles in the target language.
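One way to picture "aligning routing behavior" is an auxiliary loss that pulls the router's expert distribution for a target-language input toward the distribution it produces for the parallel English input. The summary above does not say how the alignment is done, so the sketch below is an illustration under assumptions, not the authors' method: the KL-divergence objective, the function names `routing_alignment_loss` and `total_loss`, and the `weight` value are all hypothetical.

```python
import math


def softmax(logits):
    """Turn raw router logits into a probability distribution over experts."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def routing_alignment_loss(english_logits, target_logits, eps=1e-12):
    """Hypothetical alignment term: KL(p_english || p_target) over experts.

    It is zero when both inputs are routed identically and grows as the
    target-language routing drifts away from the English routing.
    """
    p_en = softmax(english_logits)
    p_tgt = softmax(target_logits)
    return sum(
        p * math.log((p + eps) / (q + eps)) for p, q in zip(p_en, p_tgt)
    )


def total_loss(task_loss, english_logits, target_logits, weight=0.1):
    """Fine-tuning objective: task loss plus the weighted alignment term.

    The weight of 0.1 is an arbitrary placeholder, not a reported setting.
    """
    return task_loss + weight * routing_alignment_loss(
        english_logits, target_logits
    )


if __name__ == "__main__":
    english = [2.0, 0.0, 0.0]  # router prefers expert 0 for the English input
    target = [0.0, 2.0, 0.0]   # router prefers expert 1 for the translation
    print(routing_alignment_loss(english, english))
    print(routing_alignment_loss(english, target))
    print(total_loss(1.0, english, target))
```

In a real MoE model the logits would come from each router layer and the term would typically be averaged over tokens and layers; those details are left out here because the source does not specify them.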