City University of Hong Kong, Hong Kong, China
Explicitly aligning MoE routing behavior during fine-tuning can significantly boost performance on multilingual tasks, especially when the model understands the task in English but struggles in the target language.
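To make the idea of aligning routing behavior concrete, here is a minimal sketch of one possible alignment term. It is an assumption for illustration only, not the authors' method. It supposes that, for a parallel sentence pair, one MoE layer's router yields one vector of logits over experts for the English input and one for the target-language input, and it penalizes the KL divergence between the two resulting routing distributions. The function names, the choice of KL divergence, and the weighting factor `alignment_weight` are all hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of router logits."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum_exp for x in logits]


def routing_alignment_loss(english_logits, target_logits):
    """KL(p_en || p_tgt) between two routing distributions over experts.

    Hypothetical alignment term: zero when the router spreads the
    target-language input over experts exactly as it does the English one.
    """
    if len(english_logits) != len(target_logits):
        raise ValueError("both inputs must cover the same set of experts")
    log_p_en = log_softmax(english_logits)
    log_p_tgt = log_softmax(target_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p_en, log_p_tgt))


def total_loss(task_loss, english_logits, target_logits, alignment_weight=0.1):
    """Task loss plus the weighted alignment term (the weight is an assumed hyperparameter)."""
    return task_loss + alignment_weight * routing_alignment_loss(
        english_logits, target_logits
    )


if __name__ == "__main__":
    en = [2.0, 0.5, -1.0, 0.0]   # router logits for the English input (made-up values)
    tgt = [0.0, 0.5, 2.0, -1.0]  # router logits for the target-language input (made-up values)
    print(routing_alignment_loss(en, en))   # identical routing gives zero penalty
    print(routing_alignment_loss(en, tgt))  # divergent routing gives a positive penalty
```

In an actual fine-tuning setup the logits would come from the model's router and the term would be computed with a differentiable tensor library; the pure-Python version above only illustrates the arithmetic.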