Search papers, labs, and topics across Lattice.
AI Innovation, DeCoDE Lab, Red Hat
2
0
3
Length-adjusted tail entropy can serve as a powerful signal for solution quality, enabling significant performance boosts in inference-time scaling across complex domains.
Trading a fraction of inference compute for a threefold reduction in training costs, sGPO redefines efficiency in RLVR training.