School of Artificial Intelligence, Wuhan University, Wuhan, China
LLM inference gets a 2x speed boost without training, thanks to a clever technique that merges retrieval with logit-based speculation.