Escaping the tyranny of Bellman's curse, a new method leverages multi-step transitions to achieve higher-order accuracy in continuous-time policy evaluation, outperforming traditional one-step recursion.
Flow-based offline RL gets a geometric upgrade: Fisher Decorator uses a local transport map to ditch isotropic regularization and unlock state-of-the-art performance.