Search papers, labs, and topics across Lattice.
Harbin Institute of Technology
1
0
3
Even the best large vision-language models struggle with multi-image reasoning, scoring only 50% on a new benchmark designed to challenge their capabilities.