Smaller models can improve the training of larger models by supplying structured exploration signals that raise performance without the noise of traditional methods.
LLMs can achieve 2.5x higher throughput and a 10.7x reduction in KV cache memory in long-context reasoning by compressing the KV cache with trigonometric functions derived from the distributions of pre-RoPE query/key vectors.