By explicitly modeling the latent human evaluation process, VRM offers a more robust reward model, sidestepping the pitfalls of spurious correlations that plague traditional methods.