LLM judges are surprisingly susceptible to subtle rubric manipulations: small edits can induce significant preference drift while the judge's benchmark performance stays intact, creating a stealthy attack surface for biasing model alignment.