Forget about re-balancing losses: gradient geometry is the key to unlearning in LLMs without sacrificing retention.
PPO can be made sample-efficient and stable for long-horizon reasoning in LLMs by treating the problem as a sequence-level contextual bandit, sidestepping the need for computationally expensive multi-sampling.
Finally, a diffusion model lets you puppeteer multiple objects in a video with nothing but text prompts, opening the door to complex scene editing.