Search papers, labs, and topics across Lattice.
1
0
2
Unlock $\sqrt{N}$ regret in offline policy learning, even with complex policy classes, by trading off policy and environment complexity.