Shanghai Jiao Tong University
LLM serving can be sped up by 50% on average by dynamically adapting model deployments to match the changing mix of request types.
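The claim above can be sketched as a small controller loop. This is a hypothetical illustration, not the paper's actual system: the profile names, window size, and threshold are all invented for the sketch. The idea is simply that the serving layer watches the recent mix of request types and switches the active deployment profile when one type dominates.

```python
from collections import deque

# Hypothetical profiles (illustrative only): map a dominant request type
# to the deployment configuration assumed to suit it best.
PROFILES = {
    "short_chat": "latency_optimized",    # many short requests -> favor latency
    "long_doc": "throughput_optimized",   # many long requests -> favor throughput
}

class AdaptiveDeployer:
    """Toy controller: re-pick the deployment profile from a sliding
    window of recently observed request types."""

    def __init__(self, window=100, threshold=0.6):
        self.recent = deque(maxlen=window)  # sliding window of request types
        self.threshold = threshold          # fraction needed to trigger a switch
        self.profile = "latency_optimized"  # default deployment

    def observe(self, request_type):
        """Record one request and return the (possibly updated) profile."""
        self.recent.append(request_type)
        for rtype, profile in PROFILES.items():
            if self.recent.count(rtype) / len(self.recent) >= self.threshold:
                self.profile = profile
        return self.profile

deployer = AdaptiveDeployer(window=10, threshold=0.6)
for _ in range(10):
    deployer.observe("long_doc")
print(deployer.profile)  # a window dominated by long requests flips the profile
```

A real system would also weigh the cost of redeployment (model reloads, KV-cache invalidation) against the expected gain before switching; the sketch omits that entirely.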