Search papers, labs, and topics across Lattice.
1
0
3
0
Current code reward models are myopic, mostly rewarding functional correctness, but Themis-RM learns to score code across multiple criteria and languages, opening the door to more nuanced and useful code generation.