DisNet Lab, The University of Melbourne
Achieve near-zero (<50 ms) service downtime when dynamically reconfiguring LLM inference pipelines across heterogeneous GPUs in serverless environments.