By compressing KV caches along semantic boundaries, SemantiCache achieves up to 2.61x faster decoding and a smaller memory footprint without sacrificing model performance.
VLMs fail at counting blocks because they lack a view-consistent spatial interface; decomposing scenes into orthographic projections restores accurate counting.