5 papers published across 3 labs.
Generate minute-long, high-fidelity animations without visual degradation or character drift using a surprisingly simple latent flow restoration technique.
Current video understanding models struggle with long-horizon robustness and non-speech audio, as revealed by the new OmniPro benchmark designed for comprehensive omni-modal proactive evaluation.
Multimodal LLMs struggle to pinpoint objects from nouns alone, but SWIM training realigns vision and language to outperform visual-prompt methods.
UMMs struggle with cross-modal consistency not because they lack shared representations, but because their latent space transformations are misaligned, which LatentUMM fixes.
Achieve series-level cinematic remaking with Soap2Soap, a multi-agent framework that maintains narrative fidelity and character consistency across hundreds of shots, outperforming commercial video generation APIs.