Search papers, labs, and topics across Lattice.
1
0
2
5
Saturated LLM benchmarks can be revived without creating new datasets: a self-improving LLM judge in an elimination tournament recovers ranking signal and breaks ties.