Wuhan University, National Engineering Research Center for Multimedia Software, Hubei Key Laboratory of Multimedia and Network Communication Engineering
Current VideoQA models falter in understanding complex narratives, but StoryVideoQA and PlotTree redefine how we tackle deep video comprehension.
Seedance 2.0 leapfrogs existing models by unifying multi-modal inputs (text, image, audio, video) into a single architecture for generating high-quality, longer-duration audio-video content.