Munich Center for Machine Learning
MLLMs' image segmentation prowess isn't a given: a critical adapter layer actually *hurts* performance, forcing the LLM to recover via attention-mediated refinement.
MLLMs are surprisingly prone to hallucinating subtle details, especially when asked about the absence of specific attributes or relationships within an image.
Unlock precise, training-free color control in text-to-image models by directly manipulating the latent space's emergent Hue, Saturation, and Lightness structure.
You can now audit black-box vision models for biases and failure modes using only their output probabilities, thanks to a clever LLM-powered semantic search.
Achieve meaningful vision-language model alignment with significantly less supervision by leveraging unpaired data via optimal transport.
Sparse autoencoders unlock VLM interpretability: by intervening on CLIP's vision encoder, you can directly steer multimodal LLMs like LLaVA.