This paper investigates the impact of visual priming on cooperative behavior in Vision-Language Models (VLMs) using the Iterated Prisoner's Dilemma, exposing models to images depicting kindness or aggression and to color-coded reward matrices. Results demonstrate that VLMs' decision-making can be significantly swayed by both image content and color cues, revealing vulnerabilities in their cooperative strategies. Mitigation strategies such as prompt engineering, Chain of Thought (CoT) reasoning, and visual token reduction show varying degrees of success across different VLM architectures.
VLMs playing the Prisoner's Dilemma can be manipulated into selfish behavior simply by showing them images of aggression or reward matrices with specific color schemes.
As Vision-Language Models (VLMs) become increasingly integrated into decision-making systems, it is essential to understand how visual inputs influence their behavior. This paper investigates the effects of visual priming on VLMs' cooperative behavior using the Iterated Prisoner's Dilemma (IPD) as a test scenario. We examine whether exposure to images depicting behavioral concepts (kindness/helpfulness vs. aggressiveness/selfishness) and color-coded reward matrices alters VLM decision patterns. Experiments were conducted across multiple state-of-the-art VLMs. We further explore mitigation strategies including prompt modifications, Chain of Thought (CoT) reasoning, and visual token reduction. Results show that VLM behavior can be influenced by both image content and color cues, with varying susceptibility and mitigation effectiveness across models. These findings not only underscore the importance of robust evaluation frameworks for VLM deployment in visually rich and safety-critical environments, but also highlight how architectural and training differences among models may lead to distinct behavioral responses, an area worthy of further investigation.
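For readers unfamiliar with the test scenario, the following is a minimal sketch of an Iterated Prisoner's Dilemma loop and a cooperation-rate metric of the kind such an experiment might track. It is not the paper's implementation: the payoff values are the conventional textbook ones (the abstract does not state the reward matrix used), and the names `PAYOFFS`, `play_ipd`, `cooperation_rate`, and the two fixed strategies are hypothetical. In a priming study, the `player` function would be replaced by a call to a VLM that receives the game history together with the priming image.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]          # (own move, other's move) per round
Strategy = Callable[[History], str]      # returns "C" (cooperate) or "D" (defect)

# Assumed textbook payoffs (T=5 > R=3 > P=1 > S=0); not taken from the paper.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(history: History) -> str:
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not history else history[-1][1]


def always_defect(history: History) -> str:
    """Defect in every round regardless of history."""
    return "D"


def play_ipd(player: Strategy, opponent: Strategy, rounds: int = 10):
    """Play `rounds` rounds and return (history, player score, opponent score)."""
    history: History = []
    score_p = score_o = 0
    for _ in range(rounds):
        move_p = player(history)
        # The opponent sees the same history from its own perspective.
        move_o = opponent([(o, p) for p, o in history])
        pay_p, pay_o = PAYOFFS[(move_p, move_o)]
        score_p += pay_p
        score_o += pay_o
        history.append((move_p, move_o))
    return history, score_p, score_o


def cooperation_rate(history: History) -> float:
    """Fraction of rounds in which the player (first element) cooperated."""
    return sum(1 for own, _ in history if own == "C") / len(history)
```

Comparing `cooperation_rate` for the same model under different priming conditions (for example, a kindness image versus an aggression image) is one plausible way to quantify the shifts in behavior the abstract describes.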