Existing video datasets fail to capture the complexity of human interactions in diverse scenes; OmniHuman offers a new benchmark for training and evaluating models on more realistic human-centric video generation.
Audio-Omni can edit sound, music, and speech with a single model, rivaling specialized systems and unlocking capabilities like knowledge-augmented reasoning and zero-shot cross-lingual control.
Achieve high-fidelity, temporally coherent video editing without paired training data by combining sparse semantic control with dense motion and texture synthesis.