Department of Artificial Intelligence, Korea University
A text-to-video generation model achieves fine-grained semantic alignment by explicitly verifying every prompt condition against visual evidence.
Decomposing text prompts into semantic units and using VQA for fine-grained self-reflection improves image generation quality, especially for complex compositions.
A new offline RL framework achieves state-of-the-art performance on D4RL benchmarks by quantifying and leveraging data uncertainty with a computationally efficient Rank-One MIMO architecture.