Xiaohongshu Inc
Transforming the KV cache from a monolithic structure into a dynamic, head-aware system could revolutionize LLM serving efficiency and scalability.
Real-world robots forget how to fold towels after learning pick-and-place, but this work shows that experience replay can help if it is done right.