Constraining initial state representations with a simple Tanh activation and skip connections can significantly boost off-policy RL performance, rivaling more complex methods on continuous control tasks.
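As a rough illustration of the idea in this summary, the sketch below bounds a first-layer state representation with Tanh and keeps the raw state available through a skip path. It is a minimal sketch under stated assumptions, not the paper's implementation: the helper names (`init_linear`, `linear`, `encode_state`), the layer sizes, and the choice to model the skip connection as concatenation are all hypothetical.

```python
import math
import random


def init_linear(in_dim, out_dim, rng):
    """Return (weights, bias) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def linear(x, weights, bias):
    """Compute W x + b for a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def encode_state(state, weights, bias):
    """Bounded state representation with a skip connection.

    Assumption: the skip connection is modeled as concatenating the raw
    state onto the Tanh-bounded features; the paper may wire it differently.
    """
    # Tanh squashes every feature into [-1, 1], whatever the input scale.
    features = [math.tanh(v) for v in linear(state, weights, bias)]
    # Skip path: downstream layers still see the unmodified state.
    return features + list(state)


rng = random.Random(0)
state_dim, feature_dim = 4, 8  # hypothetical sizes
W, b = init_linear(state_dim, feature_dim, rng)
state = [10.0, -250.0, 0.5, 3.0]  # deliberately large-magnitude inputs
rep = encode_state(state, W, b)
```

Here `rep` would be the input to an actor or critic network: its first `feature_dim` entries are bounded regardless of how large the raw observations are, while the trailing entries carry the raw state through unchanged.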