University of Copenhagen
Achieve world-consistent video generation by directly optimizing geometry in the latent space of pre-trained video diffusion models, sidestepping costly RGB-space operations and architectural changes.
Video understanding is rapidly shifting from isolated, task-specific pipelines to unified models that adapt to diverse downstream tasks, prompting a re-evaluation of current approaches.
Unlock precise, training-free color control in text-to-image models by directly manipulating the latent space's emergent Hue, Saturation, and Lightness structure.
Sparse autoencoders unlock VLM interpretability: by intervening on features of CLIP's vision encoder, you can directly steer multimodal LLMs such as LLaVA.
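The steering idea in that last summary can be sketched generically: a sparse autoencoder (SAE) encodes an activation vector into a sparse feature code, one feature is amplified, and the decoder maps it back. The sketch below uses toy random weights and illustrative names (`steer`, `W_enc`, `W_dec`); it is not the paper's implementation nor real CLIP/LLaVA code, just the standard SAE-intervention pattern under those assumptions.

```python
import numpy as np

# Toy stand-in for a trained SAE over a vision-encoder activation.
# All weights are random here; a real SAE would be trained to
# reconstruct activations with a sparsity penalty.
rng = np.random.default_rng(0)
d_model, d_sae = 8, 32                       # toy dimensions
W_enc = rng.standard_normal((d_model, d_sae)) * 0.1
b_enc = np.zeros(d_sae)
W_dec = rng.standard_normal((d_sae, d_model)) * 0.1

def steer(activation, feature_idx, alpha):
    """Encode to a sparse code, boost one feature, decode back."""
    f = np.maximum(activation @ W_enc + b_enc, 0.0)  # ReLU sparse code
    f[feature_idx] += alpha                          # amplify chosen feature
    return f @ W_dec                                 # steered activation

x = rng.standard_normal(d_model)   # stand-in for a CLIP activation
x_steered = steer(x, feature_idx=3, alpha=5.0)
```

In practice the steered activation replaces the original at the chosen layer, so the amplified feature's direction propagates through the rest of the model and shifts its output.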