Xi'an Jiaotong University
Despite showing promise in reading raw height data, today's MLLMs often fail to translate geometric perception into reliable semantic reasoning about natural scenes, even performing worse than RGB-only models when both modalities are needed.