National Anti-Counterfeit Engineering Research Center, Huazhong University of Science and Technology
V generation refers to text-and-image-to-video generation, where both text and image prompts are used as inputs.
Image-to-video models can be jailbroken by hiding malicious instructions in seemingly harmless reference images, achieving an 83.5% attack success rate on commercial systems.