Om AI Research, Binjiang Institute of Zhejiang University
VLMs and VGMs reveal a surprising complementarity in spatial intelligence tasks, with a simple fusion of their features outperforming either model alone.
MLLMs still fail at complex visual workflows: even the best models struggle to navigate multi-step reasoning chains involving visual conditions, achieving only 53.33 Path F1 on the new MM-CondChain benchmark.