ZipCCL addresses LLM training bottlenecks by losslessly compressing communication collectives, achieving up to 1.18x end-to-end speedups without sacrificing model quality.
Naive application of LLM inference optimizations can *hurt* the performance of smaller reasoning models, highlighting the need for RLLM-specific serving strategies.