Xidian University
Generate detailed 3D indoor scenes from short text descriptions with SDesc3D, a framework that leverages multi-view structural priors and regional functionality to overcome the limitations of explicit semantic cues.
VLMs struggle to connect the dots between dynamic drone footage and satellite imagery, highlighting a critical gap in their spatial reasoning abilities.