University of Artificial Intelligence
Fixing your parallelism strategy while tuning batch size (or vice versa) leaves performance on the table: COPUS adaptively co-tunes both for faster LLM training.
Forget simple scaling laws: the compute-optimal number of parallel rollouts in LLM RL plateaus, revealing distinct mechanisms for easy vs. hard problems.