On-policy RL for machine learning engineering agents is now practical, thanks to a synthetic sandbox that cuts execution time 13-fold while improving performance by up to 67%.