HELMSMAN cuts hardware costs by over 90% while enabling billion-scale index rebuilds in hours for large-scale approximate nearest neighbor search (ANNS).
Current audio-visual generation models struggle to maintain coherence and alignment when scaling to minute-long content, a problem exposed by the new LongAV-Compass benchmark.
Moving beyond text-only chain-of-thought unlocks better audio-visual reasoning by interleaving textual steps with a unified latent space that preserves dense sensory information.
Test-time adaptation of vision-language models can actually *hurt* performance when modalities shift asymmetrically; MG-MTTA fixes this by explicitly modeling modality reliability.
Forget view-consistency tricks: language-driven 3D editing leaps forward by explicitly modeling semantic relationships between 2D edits and 3D Gaussians.