National University of Defense Technology, Beijing University of Posts and Telecommunications
Achieve faster and more accurate remote sensing interpretation by intelligently pruning visual tokens based on task-specific semantic and geometric importance, without any training.
MLLMs struggle to effectively zoom into relevant details in ultra-high-resolution remote sensing imagery, but a new staged training framework can teach them when and where to focus for substantial accuracy gains.