University of Oslo
E2LLM cuts average waiting time by over 50% in high-demand scenarios by intelligently partitioning LLMs across heterogeneous devices.