Bidirectional interaction between enhanced understanding, controllable spatial editing, and novel-view-assisted reasoning enables a unified multimodal model to achieve spatial intelligence beyond general visual competence.
Existing image editing models fall short on precise spatial manipulations, but a new benchmark and dataset point the way to closing the gap.