Audio-specific KV cache eviction lets you compress large audio-language models (LALMs) by 40% with almost no accuracy loss, while generic methods fall apart.
Speech emotion recognition (SER) models, often assumed to generalize well to synthesized speech, actually fail badly on it, revealing a reliance on spurious correlations rather than genuine emotional understanding.