ByteDance Seed
Achieve 1.69x faster Mixture-of-Experts training by dynamically rearranging expert parameters to balance load across devices.