Token-level alignment, powered by a novel distillation approach, lets LLMs learn faster and better by avoiding the pitfalls of response-level reward optimization.