Squeezing 3D vision transformers: XStreamVGGT prunes and quantizes the KV cache, cutting memory consumption by 4.42x and speeding up inference by 5.48x with negligible performance loss.