Search papers, labs, and topics across Lattice.
5
0
8
Forget rigid shot-music mappings: BEAT's elastic alignment framework finally captures the dynamic rhythm of professional movie trailer editing.
By intelligently pruning attention heads based on their spatial or temporal roles and adaptively routing denoising steps through the network, PARE achieves significant computational savings in video generation without sacrificing quality.
Autonomous research agents need to learn from their mistakes and adapt, not just generate papers, and this framework shows how to make that happen.
Task-aware localization, using attention cues from both source and target image streams, significantly reduces over-editing in instruction-based image editing, even when applied to strong diffusion transformer backbones.
MLLMs can achieve 4x faster inference without sacrificing accuracy by intelligently focusing on only the image regions relevant to the query.