Seoul National University
Large vision-language models (LVLMs) can generate more factual and detailed image captions at a lower compute cost by reflecting on their past mistakes and systematically attending to overlooked details.