The paper introduces CDH-Bench, a benchmark for evaluating commonsense-driven hallucination in vision-language models (VLMs), a failure in which a model overrides visual evidence with commonsense priors. The benchmark covers counting, relational, and attribute anomalies, and tests whether models follow the visual evidence when it contradicts commonsense. Experiments on frontier VLMs reveal vulnerability to prior-driven normalization, quantified by metrics such as Counterfactual Accuracy Drop and Commonsense Collapse Rate.
Even state-of-the-art vision-language models still struggle to reconcile visual evidence with commonsense, often hallucinating based on prior knowledge instead of what they actually see.
Vision-language models (VLMs) achieve strong performance on many benchmarks, yet a basic reliability question remains underexplored: when visual evidence conflicts with commonsense, do models follow what is shown or what commonsense suggests? A characteristic failure in this setting is that the model overrides visual evidence and outputs the commonsense alternative. We term this phenomenon commonsense-driven hallucination (CDH). To evaluate it, we introduce CDH-Bench, a benchmark designed to create explicit conflicts between visual evidence and commonsense. CDH-Bench covers three dimensions: counting anomalies, relational anomalies, and attribute anomalies. We evaluate frontier VLMs under binary Question Answering (QA) and multiple-choice QA, and report metrics including Counterfactual Accuracy (CF-Acc), Commonsense Accuracy (CS-Acc), Counterfactual Accuracy Drop (CFAD), Commonsense Collapse Rate (CCR), and Relative Prior Dependency (RPD). Results show that even strong models remain vulnerable to prior-driven normalization when visual evidence and commonsense conflict. CDH-Bench provides a controlled diagnostic of visual fidelity in this setting.
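The abstract names its metrics without defining them, so the following is only an illustrative sketch of how such quantities might be computed, not the paper's definitions. It assumes each item pairs a commonsense-consistent image with a counterfactual one, takes CFAD to be CS-Acc minus CF-Acc, and takes CCR to be the share of counterfactual items on which the model outputs the commonsense answer. The `PairedItem` type, its field names, and `score_items` are hypothetical. RPD is omitted because nothing in this text indicates how it is computed.

```python
from dataclasses import dataclass


@dataclass
class PairedItem:
    """Hypothetical record: one question asked about two versions of an image."""
    cs_answer: str   # correct answer for the commonsense-consistent image
    cf_answer: str   # correct answer for the counterfactual (anomalous) image
    pred_on_cs: str  # model's answer on the commonsense-consistent image
    pred_on_cf: str  # model's answer on the counterfactual image


def score_items(items):
    """Compute assumed versions of CS-Acc, CF-Acc, CFAD, and CCR.

    These formulas are assumptions made for illustration; the paper's
    actual definitions may differ.
    """
    n = len(items)
    if n == 0:
        raise ValueError("need at least one item")

    # Accuracy on the commonsense-consistent images.
    cs_acc = sum(it.pred_on_cs == it.cs_answer for it in items) / n
    # Accuracy on the counterfactual images.
    cf_acc = sum(it.pred_on_cf == it.cf_answer for it in items) / n
    # Assumed: drop in accuracy when moving to counterfactual images.
    cfad = cs_acc - cf_acc
    # Assumed: fraction of counterfactual items where the model gives
    # the commonsense answer instead of the one the image supports.
    ccr = sum(it.pred_on_cf == it.cs_answer for it in items) / n

    return {"CS-Acc": cs_acc, "CF-Acc": cf_acc, "CFAD": cfad, "CCR": ccr}
```

Under these assumptions, a model that answers "4" when shown a dog drawn with five legs lowers CF-Acc and raises CCR, while a wrong answer unrelated to commonsense lowers CF-Acc only.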