YOCO++ shows that a residual-connection trick lets you halve the KV cache size in LLMs while still outperforming a standard Transformer.
Multimodal models are often blind at birth: a new "Visual Attention Score" reveals that they struggle to focus on visual inputs during cold-start, but a simple attention-guided fix boosts performance by 7%.