Tencent Inc
Aligning diffusion models with human preferences just got a fidelity upgrade: DRM uses the generative backbone itself to compute rewards, enabling step-wise guidance that improves image quality.
Existing video datasets fail to capture the complexity of human interactions in diverse scenes, but OmniHuman offers a new benchmark to train and evaluate models on more realistic human-centric video generation.
Audio-Omni can edit sound, music, and speech with a single model, rivaling specialized systems and unlocking capabilities like knowledge-augmented reasoning and zero-shot cross-lingual control.
Achieve high-fidelity, temporally coherent video editing without paired training data by combining sparse semantic control with dense motion and texture synthesis.