World models can now self-improve by identifying their own prediction errors, enabled by decomposing action-conditioned prediction into components that are easier to verify.
Q-functions and implicit policy extraction enable batch online RL in robotics, delivering significant performance gains over imitation-based approaches.