Feb 25, 2026 · arXiv:2602.21877

How to Take a Memorable Picture? Empowering Users with Actionable Feedback

Francesco Laiti, Davide Talon, Jacopo Staiano

AI Summary

This paper introduces Memorability Feedback (MemFeed), a new task in which a model gives users actionable guidance for enhancing image memorability at capture time. The authors propose MemCoach, a training-free approach that uses Multimodal Large Language Models (MLLMs) and a teacher-student steering strategy to generate natural-language suggestions for improvement. They also create MemBench, a new benchmark dataset for evaluating MemFeed, and report that MemCoach outperforms several zero-shot models at improving memorability on this benchmark.

Key Contribution

Forget filters: AI can now tell you how to take a more memorable photo, offering actionable advice such as "emphasize facial expression" at capture time.

Abstract

Image memorability, i.e., how likely an image is to be remembered, has traditionally been studied in computer vision either as a passive prediction task, with models regressing a scalar score, or with generative methods that alter the visual input to boost the image's likelihood of being remembered. Yet neither paradigm supports users at capture time, when the crucial question is how to improve a photo's memorability. We introduce the task of Memorability Feedback (MemFeed), in which an automated model should provide actionable, human-interpretable guidance to users with the goal of enhancing an image's future recall. We also present MemCoach, the first approach designed to provide concrete suggestions in natural language for memorability improvement (e.g., "emphasize facial expression," "bring the subject forward"). Our method, based on Multimodal Large Language Models (MLLMs), is training-free and employs a teacher-student steering strategy, aligning the model's internal activations toward more memorable patterns learned from a teacher model progressing along least-to-most memorable samples. To enable systematic evaluation on this novel task, we further introduce MemBench, a new benchmark featuring sequence-aligned photoshoots with annotated memorability scores. Our experiments, considering multiple MLLMs, demonstrate the effectiveness of MemCoach, showing consistently improved performance over several zero-shot models. The results indicate that memorability can be not only predicted but also taught and instructed, shifting the focus from mere prediction to actionable feedback for human creators.
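The abstract does not spell out how the teacher-student steering is computed. As a rough illustration of the general idea of activation steering only, the sketch below uses a common difference-of-means recipe: a direction is taken as the gap between mean "teacher" and mean "student" activations, then added to a hidden state at inference. The function names, the `alpha` scale, and the synthetic activations are all assumptions made for this example; it should not be read as the MemCoach implementation.

```python
import numpy as np

def steering_vector(teacher_acts, student_acts):
    """Difference of mean activations, each input shaped [n_samples, hidden_dim].

    Assumption: a difference-of-means direction stands in for whatever
    steering signal the paper actually derives from its teacher model.
    """
    return teacher_acts.mean(axis=0) - student_acts.mean(axis=0)

def steer(hidden, direction, alpha=1.0):
    """Shift a hidden state along the steering direction, scaled by alpha."""
    return hidden + alpha * direction

# Synthetic stand-ins for activations collected from a model layer.
rng = np.random.default_rng(0)
teacher = rng.normal(loc=1.0, size=(8, 4))  # hypothetical "more memorable" activations
student = rng.normal(loc=0.0, size=(8, 4))  # hypothetical zero-shot activations

v = steering_vector(teacher, student)
h = rng.normal(size=4)                      # one hidden state to be steered
h_steered = steer(h, v, alpha=0.5)
```

No weights are updated in this sketch, which is the sense in which such steering can be training-free: only the activations are shifted at inference time, and `alpha` controls how strongly.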

Computer Vision, Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
