Same Attention, Different Truths: Putting the Logit Lens over Visual Attention to Detect and Mitigate LVLM Object Hallucination
Abstract
Large Vision-Language Models (LVLMs) often suffer from object hallucination, generating objects that are absent from the image. Prior work largely attributes this to insufficient visual attention. In this work, however, we find, surprisingly, that real and hallucinated objects receive equally strong visual attention in the model's mid-to-late layers. The key issue may therefore be not how much the model attends, but what it attends to and why. To probe this, we decode the visual features of high-attention regions with the Logit Lens and observe that high-attention regions for real objects decode correctly to the target object token, whereas those for hallucinated objects do not. Building on this, we identify two distinct hallucination mechanisms: (i) visual uncertainty, triggered by semantically similar or confusable regions, where masking these regions eliminates the hallucination; and (ii) contextual prior, triggered by strong co-occurrence priors, where the hallucination persists even after the initially attended region is masked and attention drifts to other regions. Based on these findings, we propose a simple yet effective training-free Detect–Mitigate framework comprising a Logit-Lens Consistency Check to detect hallucination and two targeted remedies: High-Attention Regions Masking (HARM) for visual-uncertainty hallucinations, and Visual Evidence Enhanced Decoding (VEED) for contextual-prior hallucinations, which leverages genuine visual evidence to suppress erroneous priors. Our approach achieves state-of-the-art results on multiple hallucination benchmarks.
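To make the detection step concrete, the following is a minimal sketch of a Logit-Lens Consistency Check, assuming access to a decoder-style LVLM's mid-layer hidden states at the visual-token positions, its final normalization layer, and its unembedding head. All names here (`logit_lens_consistency_check`, `attn_to_visual`, the parameter defaults) are hypothetical illustrations, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

def logit_lens_consistency_check(
    visual_hidden: torch.Tensor,   # (num_visual_tokens, d_model): mid-layer states at visual tokens
    attn_to_visual: torch.Tensor,  # (num_visual_tokens,): attention the object token pays to each visual token
    final_norm: nn.Module,         # the LVLM's final LayerNorm / RMSNorm
    lm_head: nn.Module,            # the unembedding projection to vocabulary logits
    object_token_id: int,          # token id of the generated object word
    num_regions: int = 5,          # how many high-attention regions to decode
    top_k: int = 10,               # decoded tokens per region to inspect
) -> bool:
    """Return True (consistent: object likely real) if any high-attention
    visual region decodes to the object token under the logit lens."""
    regions = attn_to_visual.topk(num_regions).indices    # indices of high-attention visual tokens
    logits = lm_head(final_norm(visual_hidden[regions]))  # (num_regions, vocab_size)
    decoded = logits.topk(top_k, dim=-1).indices          # top-k decoded token ids per region
    return bool((decoded == object_token_id).any())

# Toy usage with random tensors standing in for a real LVLM's states.
d_model, vocab_size, n_vis = 64, 1000, 24
consistent = logit_lens_consistency_check(
    visual_hidden=torch.randn(n_vis, d_model),
    attn_to_visual=torch.rand(n_vis),
    final_norm=nn.LayerNorm(d_model),
    lm_head=nn.Linear(d_model, vocab_size, bias=False),
    object_token_id=42,
)
print("consistent" if consistent else "inconsistent -> suspected hallucination")
```

When the check fails, the framework described in the abstract would then separate the two mechanisms by masking the high-attention regions and regenerating: a hallucination that disappears is attributed to visual uncertainty (handled by HARM), while one that persists is attributed to a contextual prior (handled by VEED).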