Peking University
Achieve 1.69x faster Mixture-of-Experts training by dynamically re-arranging expert parameters to balance load across devices.