HUST | Telecom | Apr 21, 2026 | arXiv:2604.19412

VCE: A zero-cost hallucination mitigation method of LVLMs via visual contrastive editing

Yanbin Huang, Yisen Li, Guiyao Tie, Xiaoye Qu, Pan Zhou, Hongfei Wang, Zhaofan Zou, Xuelong Li

AI Summary

The paper introduces Visual Contrastive Editing (VCE), a zero-cost post-hoc method to mitigate object hallucination in Large Vision-Language Models (LVLMs). VCE leverages Singular Value Decomposition (SVD) on model activations in response to contrastive visual perturbations to identify and suppress hallucination subspaces through targeted parameter edits. Experiments show VCE reduces object hallucination across benchmarks without fine-tuning or labeled data, preserving computational efficiency.

Key Contribution

Object hallucination in LVLMs can be significantly reduced *after* training through label-free parameter edits, with no fine-tuning and no added inference cost.

Abstract

Large vision-language models (LVLMs) frequently suffer from Object Hallucination (OH), wherein they generate descriptions containing objects that are not actually present in the input image. This phenomenon is particularly problematic in real-world applications such as medical imaging and autonomous driving, where accuracy is critical. Recent studies suggest that the hallucination problem may stem from language priors: biases learned during pretraining that cause LVLMs to generate words based on their statistical co-occurrence. To mitigate this problem, we propose Visual Contrastive Editing (VCE), a novel post-hoc method that identifies and suppresses hallucinatory tendencies by analyzing the model's response to contrastive visual perturbations. Using Singular Value Decomposition (SVD), we decompose the model's activation patterns to isolate hallucination subspaces and apply targeted parameter edits to attenuate their influence. Unlike existing approaches that require fine-tuning or labeled data, VCE operates as a label-free intervention, making it both scalable and practical for deployment in resource-constrained settings. Experimental results demonstrate that VCE effectively reduces object hallucination across multiple benchmarks while maintaining the model's original computational efficiency.
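
The abstract does not specify which layer is analyzed, how the images are perturbed, how many directions are kept, or the exact form of the edit. The following is therefore only a minimal NumPy sketch of the general idea (take the SVD of activation differences, then project the leading directions out of a weight matrix), not the authors' implementation. The function names, the `alpha` strength parameter, and the synthetic activations are all hypothetical and chosen for illustration.

```python
import numpy as np


def hallucination_subspace(h_clean, h_perturbed, k):
    """Return an orthonormal basis (d, k) for the top-k directions along
    which activations shift under the visual perturbation.

    Assumption: both inputs are (n_samples, d) activations from one layer.
    """
    diff = h_perturbed - h_clean
    # Rows of vt are right singular vectors, ordered by singular value.
    _, _, vt = np.linalg.svd(diff, full_matrices=False)
    return vt[:k].T


def edit_weight(weight, basis, alpha=1.0):
    """Attenuate the part of a layer's output that lies in span(basis).

    Assumption: `weight` has shape (d, d_in), so its outputs live in the
    same d-dimensional space as the analyzed activations. alpha=1 removes
    the subspace entirely; smaller values only dampen it.
    """
    d = basis.shape[0]
    projector = np.eye(d) - alpha * (basis @ basis.T)
    return projector @ weight


# Synthetic stand-in data: the perturbation shifts activations mainly
# along two hidden directions, plus a little isotropic noise.
rng = np.random.default_rng(0)
n, d, d_in, k = 64, 16, 8, 2
true_dirs = np.linalg.qr(rng.normal(size=(d, k)))[0]
h_clean = rng.normal(size=(n, d))
shift = 5.0 * rng.normal(size=(n, k)) @ true_dirs.T
h_perturbed = h_clean + shift + 0.01 * rng.normal(size=(n, d))

basis = hallucination_subspace(h_clean, h_perturbed, k)
weight = rng.normal(size=(d, d_in))
edited = edit_weight(weight, basis)
```

Because the edit is a one-time change to the stored weights, the edited layer has the same shape and the same inference cost as the original, which is consistent with the abstract's claim that computational efficiency is preserved.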

Computer Vision | Eval Frameworks & Benchmarks | Multimodal Models

Year: 2026

Venue: N/A
