Achieve 75% input length reduction in LLMs with minimal performance loss by compressing token embeddings directly in the latent space.