Generate realistic, controllable videos of humans interacting with objects using only sparse motion cues, such as wrist positions and object bounding boxes.
Achieve unified image generation by progressively disentangling and weaving together concept and localization representations within a diffusion framework, outperforming prior methods on diverse tasks.