School of Electrical and Computer Engineering, The University of Sydney, Darlington, NSW, Australia
Stop leaving performance on the table: jointly optimizing resource allocation and request batching with reinforcement learning can yield up to 24x speedups for multi-tenant GPU inference.