Today's best multimodal LLMs still struggle to grasp fine-grained details and reason across multiple entities in images, even with access to external knowledge.