Shanghai Jiao Tong University
Hopper's idle NVLink Copy Engine can be turned into a nearly free communication channel for MoE load balancing, slashing token stragglers by up to 70% without impacting existing parallelism.