HIT | Mar 16, 2026 | arXiv:2603.15020

MER-Bench: A Comprehensive Benchmark for Multimodal Meme Reappraisal

Yiqi Nie, Fei Wang, Junjie Chen, Kun Li, Yudi Cai, Dan Guo, Chenglong Li, Meng Wang

AI Summary

The paper introduces Meme Reappraisal, a new multimodal generation task focused on transforming negatively framed memes into constructive ones while preserving key structural and semantic elements. To facilitate research in this area, the authors present MER-Bench, a benchmark dataset of real-world memes with detailed multimodal annotations, including emotion labels, rewritten text, and visual editing specifications. They also propose an MLLM-as-a-Judge evaluation framework to assess modality-level quality, affective control, structural fidelity, and global alignment, revealing limitations in existing systems.

Key Contribution

Can AI transform a grumpy cat meme into a beacon of positivity while keeping the cat recognizable?

Abstract

Memes represent a tightly coupled, multimodal form of social expression, in which visual context and overlaid text jointly convey nuanced affect and commentary. Inspired by cognitive reappraisal in psychology, we introduce Meme Reappraisal, a novel multimodal generation task that aims to transform negatively framed memes into constructive ones while preserving their underlying scenario, entities, and structural layout. Unlike prior works on meme understanding or generation, Meme Reappraisal requires emotion-controllable, structure-preserving multimodal transformation under multiple semantic and stylistic constraints. To support this task, we construct MER-Bench, a benchmark of real-world memes with fine-grained multimodal annotations, including source and target emotions, positively rewritten meme text, visual editing specifications, and taxonomy labels covering visual type, sentiment polarity, and layout structure. We further propose a structured evaluation framework based on a multimodal large language model (MLLM)-as-a-Judge paradigm, decomposing performance into modality-level generation quality, affect controllability, structural fidelity, and global affective alignment. Extensive experiments across representative image-editing and multimodal-generation systems reveal substantial gaps in satisfying the constraints of structural preservation, semantic consistency, and affective transformation. We believe MER-Bench establishes a foundation for research on controllable meme editing and emotion-aware multimodal generation. Our code is available at: https://github.com/one-seven17/MER-Bench.
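The annotation fields and the four evaluation dimensions named in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions: the field names, the dimension keys, and the unweighted-mean aggregation are hypothetical and are not taken from the MER-Bench code or data format.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical keys derived from the abstract's wording; the real
# benchmark may name or structure these dimensions differently.
JUDGE_DIMENSIONS = (
    "modality_quality",
    "affect_controllability",
    "structural_fidelity",
    "global_affective_alignment",
)


@dataclass
class MemeAnnotation:
    """Illustrative record mirroring the annotation types listed in the abstract."""
    source_emotion: str
    target_emotion: str
    rewritten_text: str
    visual_edit_spec: str
    visual_type: str
    sentiment_polarity: str
    layout_structure: str


def summarize_judge_scores(scores: dict[str, float]) -> float:
    """Average the four per-dimension judge scores.

    The unweighted mean is an assumption made for this sketch; the paper
    may report dimensions separately or combine them another way.
    """
    missing = [d for d in JUDGE_DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing judge dimensions: {missing}")
    return mean(scores[d] for d in JUDGE_DIMENSIONS)
```

Keeping the per-dimension scores in a dictionary makes it easy to report them individually as well, which is closer to the decomposed evaluation the abstract describes than a single aggregate number.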

Eval Frameworks & Benchmarks | Multimodal Models | Natural Language Processing

Citation Metrics

Citations: 0
Influential citations: 0
References: 0
Year: 2026
Venue: N/A
