Search papers, labs, and topics across Lattice.
1
0
3
Unsupervised RL for text generation doesn't have to collapse into gibberish: rewarding relative information gain between specialist and generalist policies unlocks meaningful content creation.