Peking University, PKU
Achieve 1.69x faster Mixture-of-Experts training by dynamically re-arranging expert parameters to balance load across devices.
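
To make the load-balancing idea concrete, here is a minimal sketch of one way expert placement could be rebalanced. It is not the method from the paper: it assumes a simple greedy heuristic (heaviest expert first, onto the least-loaded device), and the names `place_experts`, `max_load`, and `token_counts` are hypothetical, introduced only for illustration. It also ignores practical constraints such as per-device memory limits and the cost of moving parameters.

```python
from typing import Dict, List


def place_experts(token_counts: List[int], num_devices: int) -> Dict[int, List[int]]:
    """Greedy placement (an assumed heuristic, not the paper's algorithm).

    token_counts[e] is the number of tokens routed to expert e in the
    most recent step. Experts are assigned heaviest first, each to the
    device that currently carries the least load.
    """
    placement: Dict[int, List[int]] = {d: [] for d in range(num_devices)}
    load = [0] * num_devices
    # Visit experts in order of decreasing token count.
    order = sorted(range(len(token_counts)), key=lambda e: -token_counts[e])
    for e in order:
        # Pick the least-loaded device (ties go to the lowest index).
        d = min(range(num_devices), key=lambda i: load[i])
        placement[d].append(e)
        load[d] += token_counts[e]
    return placement


def max_load(token_counts: List[int], placement: Dict[int, List[int]]) -> int:
    """Load of the busiest device, which bounds the step time."""
    return max(sum(token_counts[e] for e in experts) for experts in placement.values())


if __name__ == "__main__":
    counts = [900, 100, 100, 100, 500, 300, 50, 50]  # made-up routing counts
    static = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}      # fixed contiguous layout
    balanced = place_experts(counts, num_devices=2)
    print("static max load:  ", max_load(counts, static))
    print("balanced max load:", max_load(counts, balanced))
```

In a real system the placement would presumably be recomputed periodically as routing statistics drift, and the gain from better balance would have to be weighed against the communication cost of relocating expert parameters.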