Search papers, labs, and topics across Lattice.
1
0
3
Get better image captions without more data: reinforcement learning can train vision-language models to focus on image details by maximizing the similarity between images retrieved using the generated captions.