Forget scaling laws: a large VLM strategically paired with a smaller model's reasoning tokens can rival the performance of a much larger, monolithic model.
Token ranking heuristics for LLM prefill are surprisingly unstable across layers, but simply aggregating attention scores across layers can dramatically improve performance.