Search papers, labs, and topics across Lattice.
1
0
2
Log-barrier regularization can provably rescue policy optimization from getting stuck in suboptimal regions by structurally enforcing exploration, without sacrificing sample complexity.