Search papers, labs, and topics across Lattice.
2
0
4
Existing safety guardrails for text-to-image models can backfire, inadvertently amplifying other types of harm, but this new method adaptively steers generation to resolve these conflicts and reduce overall harmful content.
Image-to-video models can be jailbroken by hiding malicious instructions in seemingly harmless reference images, achieving an 83.5% attack success rate on commercial systems.