LLMs can learn better from human feedback by exploring more, thanks to a simple coin-flip counting method that encourages them to try new things.