Nanjing University
Forget GPU-centric All-Reduce: SCIN's switch-based architecture slashes latency by up to 8.7x and boosts LLaMA-2 performance by 34% through in-network quantization.