Video2LoRA slashes visual-token load by up to 1,500x while maintaining performance, revolutionizing video processing in vision-language models.
Current VLMs ace diagram question answering, but DRAGON reveals they often fake it, failing to ground their answers in the actual visual evidence.
Open-source diffusion models can now achieve state-of-the-art illumination control rivaling closed-source alternatives, thanks to a novel training pipeline and dataset.
Finally, AI can generate hour-long videos with consistent characters and backgrounds, thanks to a new framework that nails seamless transitions between shots.