University of Southern California, Los Angeles, CA, USA
Achieve near-optimal DLRM inference speedups across diverse hardware (NVIDIA, AMD, TPU) with a single optimization pass, eliminating the need for vendor-specific tuning.