University, University of Southern California, Seoul National University
MLLMs struggle to generalize in Video Temporal Grounding not only because of unseen concepts but also because visual domain shift breaks their ability to link temporal localization with entity attention. EVIDENT addresses this problem by explicitly routing adaptation through visual entity evidence.