MBZUAI, EPITA
Q-learning regret bounds can be achieved without optimism, but are highly sensitive to the suboptimality gap, motivating a new smoothed exploration strategy.
Skip the expensive gradients: a simple VJP-free approximation lets you edit images and videos with diffusion models just as well as training-heavy approaches.