Nanjing University
StreamOV achieves real-time, proactive video understanding by using bounded memory and a novel response trigger to overcome the limitations of offline methods.
Task-aware localization, using attention cues from both source and target image streams, significantly reduces over-editing in instruction-based image editing, even when applied to strong diffusion transformer backbones.
Current video LLMs falter when faced with the demands of real-time interaction, a gap RIVER Bench directly addresses by providing a challenging new evaluation framework.