Achieve safe reinforcement learning (RL) exploration from day one, without conservative optimization or hyperparameter tuning, by probabilistically regulating a new policy based on a safe reference policy.
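
As a rough illustration of what "probabilistically regulating" a new policy with a safe reference policy could look like, here is a minimal sketch. It is an assumption for illustration only, not the method from the summarized work: the function `regulated_action`, the `trust` probability, and the toy policies are all hypothetical names, and the simple per-step coin flip stands in for whatever regulation rule is actually used.

```python
import random
from typing import Callable, Optional

# Toy types: a policy maps a scalar state to a scalar action (assumption).
Policy = Callable[[float], float]


def regulated_action(
    state: float,
    new_policy: Policy,
    reference_policy: Policy,
    trust: float,
    rng: Optional[random.Random] = None,
) -> float:
    """Pick an action from the new policy with probability `trust`,
    otherwise fall back to the safe reference policy.

    `trust` is a hypothetical knob in [0, 1]; how it would be set or
    adapted is outside the scope of this sketch.
    """
    if not 0.0 <= trust <= 1.0:
        raise ValueError("trust must lie in [0, 1]")
    rng = rng or random.Random()
    # rng.random() is in [0, 1), so trust=0 always defers to the reference
    # policy and trust=1 always uses the new policy.
    if rng.random() < trust:
        return new_policy(state)
    return reference_policy(state)


def safe_reference(state: float) -> float:
    """Hypothetical known-safe controller: always take the null action."""
    return 0.0


def learned_policy(state: float) -> float:
    """Hypothetical policy under training: acts proportionally to state."""
    return 2.0 * state
```

With `trust=0.0` the agent behaves exactly like the reference policy, and raising `trust` hands progressively more control to the new policy, which is one plausible reading of how exploration could stay safe early in training.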