Recurrent Transformers let you trade model depth for width, slashing KV-cache memory footprint and inference latency without sacrificing model quality.
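The memory side of this tradeoff can be sketched with back-of-envelope arithmetic. Assuming the KV cache scales as 2 (K and V) × layers × sequence length × hidden dimension × bytes per element, and that parameter count scales roughly as layers × hidden², a shallower-but-wider model at a matched parameter budget carries a smaller cache. The dimensions below are hypothetical, chosen only to illustrate the scaling:

```python
def kv_cache_bytes(num_layers: int, hidden_dim: int,
                   seq_len: int, dtype_bytes: int = 2) -> int:
    """Rough KV-cache size: 2 tensors (K and V) per attention layer,
    each seq_len x hidden_dim, at dtype_bytes per element (fp16 = 2)."""
    return 2 * num_layers * seq_len * hidden_dim * dtype_bytes

# Hypothetical configs with a matched parameter budget
# (layers * hidden^2 is equal for both):
deep_narrow = kv_cache_bytes(num_layers=48, hidden_dim=4096, seq_len=4096)
shallow_wide = kv_cache_bytes(num_layers=12, hidden_dim=8192, seq_len=4096)

print(deep_narrow)                 # 3221225472 bytes (~3.0 GiB)
print(shallow_wide)                # 1610612736 bytes (~1.5 GiB)
print(deep_narrow / shallow_wide)  # 2.0x cache reduction
```

Under these assumptions, halving depth while widening to keep parameters fixed halves the KV cache, since the cache grows linearly in both depth and width but parameters grow quadratically in width.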