Columbia University
AMMA, a memory-centric architecture for long-context LLMs, cuts attention latency by 15x and energy consumption by 7x compared with GPU-centric designs.
Frontier LLMs can unlock substantial performance gains in scientific domains by refining and completing raw scientific text, yielding a +8.40 point improvement on domain-aligned tasks.