Microsoft
Language specialization in multilingual MoEs happens mostly in the final layers, suggesting a surprisingly simple recipe for parameter-efficient adaptation.