Huazhong University of Science and Technology, Wuhan, China
2
49
4
28
Vision-language models (VLMs) can be significantly improved by reasoning over diverse, generated text inputs rather than relying on restrictive, predefined templates.
Current video moment retrieval systems fail catastrophically when given irrelevant queries, but this work introduces a method to detect and reject such queries, preventing potentially dangerous false retrievals.