Search papers, labs, and topics across Lattice.
1
0
3
24
Ditching latent critics in offline RL unlocks state-of-the-art performance by directly backpropagating action-space gradients through a differentiable flow-based policy, enabling robust latent policy steering with minimal tuning.