Search papers, labs, and topics across Lattice.
4
0
6
0
Ditch slow, error-prone autoregressive video captioning: diffusion models can now generate captions in parallel, rivaling autoregressive quality with a significant speed boost.
Achieve state-of-the-art micro-expression recognition with a Transformer-based architecture that slashes computational complexity without sacrificing accuracy.
Medical VQA gets a boost: Mamba-based models enhanced with knowledge graphs now outperform existing methods by better associating lesion features with diagnostic criteria.
LLMs can now capture an author's unique voice in translations, thanks to a multi-agent system guided by a "Stylistic Feature Spectrum" derived from wavelet transforms.