Federated LLM inference gets a speed boost: SpecFed's speculative decoding and compressed communication slash latency without sacrificing generation quality.