Search papers, labs, and topics across Lattice.
3
0
6
1
Even frontier models like GPT-5 and Claude are highly susceptible to multi-turn jailbreaks that exploit their reliance on inferred user intent, and can even leak harmful information indirectly through "para-jailbreaking."
Forget hand-crafted rewards: VLLR leverages LLMs and VLMs to automatically generate dense rewards, boosting robotic task success rates by up to 56% on long-horizon tasks.
MLLMs can learn to be safer at inference time, without any additional training, by remembering and reasoning about past safety failures.