Training LLMs on ultra-long contexts just got a whole lot easier: AutoSP automates sequence parallelism and activation checkpointing, boosting context length by up to 2.7x with negligible throughput cost.
By reusing prior attention computations, long-context LLMs can achieve up to 14x attention speedups and a 60% reduction in end-to-end latency without sacrificing quality.