Search papers, labs, and topics across Lattice.
2
0
4
1
By rethinking RLHF, MicroCoder-GRPO enables smaller code generation models to rival larger counterparts, achieving significant performance gains and revealing 34 training insights.
Forget massive datasets – targeted training on a smaller, carefully curated dataset of challenging competitive programming problems yields 3x faster gains in code generation performance.