The Hebrew University of Jerusalem, IBM Research
Using a top- or bottom-performing LLM as the anchor in "LLM-as-a-judge" benchmarks can dramatically skew results, so choosing a mediocre (mid-performing) anchor is key to reliable evaluation.
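One way to build intuition for this claim is a toy calculation. The sketch below is not the paper's method: it assumes a simple logistic (Bradley-Terry-style) model in which each model has a hypothetical latent skill and its expected win rate against the anchor depends only on the skill gap. The skill values, the anchor values, and the function names are all illustrative assumptions. Under that assumption, an extreme anchor pushes every win rate toward 0 or 1, so the differences between evaluated models shrink.

```python
import math


def win_rate_vs_anchor(skill: float, anchor_skill: float) -> float:
    """Expected win rate against the anchor under an assumed logistic model."""
    return 1.0 / (1.0 + math.exp(-(skill - anchor_skill)))


def win_rate_spread(skills: list[float], anchor_skill: float) -> float:
    """Gap between the best and worst expected win rate for a given anchor."""
    rates = [win_rate_vs_anchor(s, anchor_skill) for s in skills]
    return max(rates) - min(rates)


# Hypothetical latent skills for five evaluated models (illustrative only).
skills = [-1.0, -0.5, 0.0, 0.5, 1.0]

spread_mid = win_rate_spread(skills, anchor_skill=0.0)      # mid-performing anchor
spread_top = win_rate_spread(skills, anchor_skill=4.0)      # top-performing anchor
spread_bottom = win_rate_spread(skills, anchor_skill=-4.0)  # bottom-performing anchor

print(f"mid anchor spread:    {spread_mid:.3f}")
print(f"top anchor spread:    {spread_top:.3f}")
print(f"bottom anchor spread: {spread_bottom:.3f}")
```

In this toy setting the mid-performing anchor separates the models far more widely than either extreme anchor, which leaves more room to tell them apart once judge noise is added. Real benchmarks may behave differently; the logistic model is only a convenient assumption.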