Max Planck Institute for Informatics, SIC, VIA Research Center
LVLMs can self-detect and correct object hallucinations by focusing on specific image regions, offering a simple, training-free fix.
Ditch the language priors: SSL-R1 unlocks verifiable rewards for MLLM reinforcement learning directly from images, using self-supervision to solve visual puzzles.
Semantic scene understanding can be injected into novel view synthesis to generate more plausible and consistent images, especially under long-range camera motion, improving FID scores by up to 15%.