Yale University {aniketh.g, manasi.patwardhan}@tcs.com
Reference-guided LLM evaluators can boost alignment in non-verifiable domains, enabling self-improvement that can rival reward model training.