LLM judges of chain-of-thought reasoning can be easily fooled: they struggle to pinpoint causal errors and consistently overestimate the quality of incomplete reasoning.