Search papers, labs, and topics across Lattice.
4
0
10
5
Current LLM judges show a troubling reliability gap in long-form evaluations, raising questions about their effectiveness in real-world applications.
LLMs can beat state-of-the-art tensor compilers on individual subgraphs, but struggle with consistency, revealing a path to unlock their full potential through targeted training.
Meta's new hierarchical indexing method lets you deploy massive recommendation models without sacrificing speed or accuracy, and it turns out the index itself highlights a high-quality subset of data perfect for test-time training.
Injecting demographic attributes directly into LLM hidden states can drastically improve the diversity and realism of public opinion simulations.