Shanghai Jiao Tong University
Decoupling temporal and spatial reasoning in video grounding unlocks significant performance gains, outperforming existing MLLM-based methods by a large margin.