Layer-Selective Attention Caching cuts computation by 25% while improving audio quality retention by up to 6.7 times in audio separation models.
Audio-specific KV cache eviction lets you compress large audio language models (LALMs) by 40% with almost no accuracy loss, while generic methods fall apart.