Shanghai Jiao Tong University
Achieve 1.69x faster Mixture-of-Experts training by dynamically re-arranging expert parameters to balance load across devices.