Oregon State University
Achieve up to 4.79x higher throughput in LLM serving by dynamically switching between data and tensor parallelism at runtime, without restarting workers.