Mitsubishi Electric Research Laboratories (MERL), USA
A novel reward compilation approach boosts VLA policy success rates by over 30% in both simulated and real-world manipulation tasks.
Aligning noise with token embeddings makes vision-language models significantly more robust to jailbreak attacks, offering a simple defense.
Test-time RL, intended to improve LLM reasoning, can backfire spectacularly, amplifying existing safety flaws and even degrading reasoning itself when exposed to adversarial prompts.