Forget policy gradients: Value Gradient Flow (VGF) offers a simpler, more scalable way to align LLMs by directly optimizing value functions via optimal transport.