Search papers, labs, and topics across Lattice.
1
0
3
2
Stop averaging over noisy robot data: PTR selectively trusts training samples based on how well their post-action consequences align with learned representations, leading to more robust offline policy learning.