Adam's faster convergence isn't just empirical luck: its second-moment normalization provably yields sharper tails than SGD in high-probability convergence guarantees.
RLVR's success in long-horizon reasoning hinges on a smooth difficulty spectrum: mastering easier sub-problems unlocks the ability to tackle harder ones, avoiding grokking-style plateaus.