University of Illinois at Urbana-Champaign
Achieve a 75% reduction in LLM input length with minimal performance loss by compressing token embeddings directly in the latent space.