Search papers, labs, and topics across Lattice.
1
0
2
SemantiCache achieves up to 2.61x faster decoding and reduces memory footprint without sacrificing model performance by compressing KV caches along semantic boundaries.