LIVIA, Dept. of Systems Engineering, ETS Montreal, Canada
Despite advances in multimodal deep learning, recognizing subtle emotional states like ambivalence and hesitancy from video remains a significant challenge, even for state-of-the-art models.
Forget noisy LLM-generated prompts: this method uses interpretable Action Units to guide CLIP for personalized, fine-grained video emotion recognition, achieving state-of-the-art results.
Forget synthetic data: VectorGym offers a new benchmark for SVG code generation, sketching, and editing with gold-standard human annotations, revealing surprising performance gaps in even the most powerful VLMs.