Oak Ridge National Laboratory
Achieve up to 4.79x higher LLM serving throughput by switching between data and tensor parallelism on the fly, without restarting workers.