LongLive-RAG improves long video generation by giving the model a searchable memory of past latents, drastically reducing error accumulation.
LLMs can achieve 2.5x higher throughput and 10.7x KV memory reduction in long-context reasoning by compressing the KV cache using trigonometric functions derived from pre-RoPE query/key vector distributions.