University of California, San Diego
Forget GPU-centric designs: AMMA slashes attention latency by 15x and energy consumption by 7x with a memory-centric architecture for long-context LLMs.
LLMs still struggle to reason in context when cultural and linguistic nuances are involved, achieving only 44% accuracy on a new grounded benchmark spanning 14 languages.
LLMs can leapfrog state-of-the-art scientific algorithms and human-designed solutions, but only if you scale the evaluation loop, not just the model.
Ditch the training overhead and still get up to 4.79x faster diffusion sampling with Spectrum, a training-free feature forecasting method that actually maintains image quality.
LLMs still have a long way to go in AI-aided chip design: even the best models achieve surprisingly low scores on the new ChipBench benchmark for Verilog generation and reference model creation.