Search papers, labs, and topics across Lattice.
1
0
3
2
Finally, a reinforcement learning algorithm, PGP, can provably find near-optimal policies that respect safety and resource constraints, even when the policy space is non-convex.