BUPT | Northwestern | TeleAI | Feb 26, 2026 | arXiv:2602.22727

HulluEdit: Single-Pass Evidence-Consistent Subspace Editing for Mitigating Hallucinations in Large Vision-Language Models

Yang-Sen Lin, Yangguang Lin, Quan Fang, Yufei Li, Jiachen Sun, Junyu Gao, Jitao Sang

AI Summary

The paper introduces HulluEdit, a single-pass, reference-free framework to mitigate object hallucination in Large Vision-Language Models (LVLMs) by selectively suppressing hallucinatory patterns. HulluEdit decomposes hidden states into orthogonal subspaces representing visual evidence, conflicting priors, and residual uncertainty, enabling targeted intervention on the prior subspace. Experiments demonstrate state-of-the-art hallucination reduction on POPE and CHAIR benchmarks, while preserving general capabilities and maintaining efficient inference, outperforming contrastive decoding and static subspace editing baselines.

Key Contribution

By surgically removing "hallucination patterns" from a model's hidden state, HulluEdit offers a reference-free, single-pass method to dramatically reduce object hallucinations in LVLMs without sacrificing visual grounding.

Abstract

Object hallucination in Large Vision-Language Models (LVLMs) significantly hinders their reliable deployment. Existing methods struggle to balance efficiency and accuracy: they often require expensive reference models and multiple forward passes, or apply static edits that risk suppressing genuine visual evidence. To address this, we introduce HulluEdit, a single-pass, reference-free intervention framework. Our core innovation is orthogonal subspace editing: we decompose the hidden states of the model into orthogonal subspaces - visual evidence, conflicting priors, and residual uncertainty - enabling selective suppression of hallucinatory patterns without interfering with visual grounding. This approach mathematically guarantees that edits applied to the prior subspace leave the visual component entirely unaffected. Extensive experiments show that HulluEdit achieves state-of-the-art hallucination reduction on benchmarks including POPE and CHAIR across diverse architectures, while preserving general capabilities on MME and maintaining efficient inference. Our method consistently outperforms contrastive decoding and static subspace editing baselines, offering a new pathway toward more trustworthy LVLMs.
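The orthogonality claim in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation: the abstract does not say how the visual-evidence and prior subspaces are estimated, so the random direction matrices, the helper names (`build_subspaces`, `edit_hidden_state`), and the `strength` parameter below are all hypothetical. The sketch only shows that once the prior basis is made orthogonal to the visual basis, scaling the prior component leaves the visual and residual components of a hidden state unchanged.

```python
import numpy as np


def orthonormal_basis(vectors):
    """Return an orthonormal basis (as columns) for the span of the input columns."""
    q, _ = np.linalg.qr(vectors)  # reduced QR: q has the same shape as `vectors`
    return q


def build_subspaces(visual_dirs, prior_dirs):
    """Build orthonormal bases U (visual) and P (prior) with U.T @ P == 0.

    Assumption: both inputs are (d, k) matrices of candidate directions.
    How such directions are obtained is NOT specified by the abstract.
    """
    U = orthonormal_basis(visual_dirs)
    # Strip the visual component out of the prior directions before
    # orthonormalizing, so the two subspaces are orthogonal by construction.
    prior_perp = prior_dirs - U @ (U.T @ prior_dirs)
    P = orthonormal_basis(prior_perp)
    return U, P


def edit_hidden_state(h, U, P, strength=1.0):
    """Split h into visual, prior, and residual parts; damp only the prior part.

    `strength` is a hypothetical knob: 0 leaves h untouched, 1 removes the
    prior component entirely.
    """
    h_vis = U @ (U.T @ h)
    h_prior = P @ (P.T @ h)
    h_res = h - h_vis - h_prior
    h_edited = h_vis + (1.0 - strength) * h_prior + h_res
    return h_edited, (h_vis, h_prior, h_res)


# Toy data standing in for real hidden states and estimated directions.
rng = np.random.default_rng(0)
d = 16
visual_dirs = rng.standard_normal((d, 3))
prior_dirs = rng.standard_normal((d, 2))

U, P = build_subspaces(visual_dirs, prior_dirs)
h = rng.standard_normal(d)
h_edited, (h_vis, h_prior, h_res) = edit_hidden_state(h, U, P, strength=0.8)

# Coordinates in the visual subspace before and after the edit.
visual_before = U.T @ h
visual_after = U.T @ h_edited
```

Comparing `visual_before` and `visual_after` shows they agree up to floating-point error, which is the property the abstract describes: an edit confined to the prior subspace cannot alter the visual component.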

Computer Vision, Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 31

Year: 2026

Venue: N/A
