Z-Reward reaches nearly 90% accuracy in predicting human preferences by transforming subjective visual preferences into nuanced score distributions, outperforming traditional reward models.