University of Maryland, Princeton University, Columbia University ⋆, Harvard University, Lawrence Livermore National Laboratory
LCLMs redefine the efficiency of long-context inference, achieving superior compression without sacrificing model quality.
LLMs can leverage "sleep" to distill long contexts into fast weights, unlocking superior reasoning without sacrificing inference latency.