The paper introduces PackMonitor, a decoding-time intervention method that eliminates package hallucinations in LLMs used for dependency recommendation. PackMonitor exploits the fact that package validity is decidable against authoritative package lists: a Context-Aware Parser triggers intervention only while an installation command is being generated, and a Package-Name Intervenor constrains the decoding space to names on the list. Experiments across five LLMs show that PackMonitor reduces the package hallucination rate to zero with low latency and preserved model capabilities, without requiring any training.
Achieve zero package hallucinations from LLMs in dependency recommendation by monitoring the decoding process and intervening with an authoritative package list.
As Large Language Models (LLMs) are increasingly integrated into software development workflows, their trustworthiness has become a critical concern. In dependency recommendation scenarios, however, the reliability of LLMs is undermined by widespread package hallucinations, where models recommend packages that do not exist. Recent studies have proposed a range of approaches to mitigate this issue, but these approaches typically only reduce hallucination rates rather than eliminate them, leaving persistent software security risks. In this work, we argue that package hallucinations are theoretically preventable, based on the key insight that package validity is decidable through finite and enumerable authoritative package lists. Building on this, we propose PackMonitor, the first approach capable of fundamentally eliminating package hallucinations by continuously monitoring the model's decoding process and intervening when necessary. To implement this in practice, PackMonitor addresses three key challenges: (1) determining when to trigger intervention, via a Context-Aware Parser that continuously monitors model outputs and activates intervention only during installation command generation; (2) resolving how to intervene, by employing a Package-Name Intervenor that strictly limits the decoding space to an authoritative package list; and (3) ensuring monitoring efficiency, through a DFA-Caching Mechanism that enables scalability to millions of packages with negligible overhead. Extensive experiments on five widely used LLMs demonstrate that PackMonitor is a training-free, plug-and-play solution that consistently reduces package hallucination rates to zero while maintaining low-latency inference and preserving original model capabilities.
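The three components described above can be illustrated with a toy sketch. This is not the PackMonitor implementation, whose details the abstract does not give; it is a minimal illustration under stated assumptions. The package list, the vocabulary, the `pip install ` trigger, and all function names are hypothetical. A trie stands in for the automaton over valid names, `functools.lru_cache` memoization stands in for the DFA-Caching Mechanism, and candidate tokens are plain strings rather than logits.

```python
from functools import lru_cache

# Hypothetical stand-ins: a real system would load a registry snapshot
# and use the model's tokenizer vocabulary.
VALID_PACKAGES = ("numpy", "requests", "requests-oauthlib")
VOCAB = ("num", "py", "req", "uests", "-oauthlib", "reqests", "\n", " ", "panda")
TRIGGER = "pip install "  # assumed trigger; flags such as -U are not handled
END = ""                  # end-of-name marker; never matches a single character


def build_trie(names):
    """Build a character trie over the authoritative package names."""
    root = {}
    for name in names:
        node = root
        for ch in name:
            node = node.setdefault(ch, {})
        node[END] = {}
    return root


TRIE = build_trie(VALID_PACKAGES)


def walk(prefix):
    """Return the trie node reached by prefix, or None if no valid name starts with it."""
    node = TRIE
    for ch in prefix:
        node = node.get(ch)
        if node is None:
            return None
    return node


def package_prefix(text):
    """Toy context check: the partial package name being written on the
    current line, or None when the line is not an install command."""
    line = text.rsplit("\n", 1)[-1]
    idx = line.rfind(TRIGGER)
    if idx < 0:
        return None
    return line[idx + len(TRIGGER):].rsplit(" ", 1)[-1]


def split_at_whitespace(tok):
    """Return (characters before the first whitespace, whether whitespace was found)."""
    for i, ch in enumerate(tok):
        if ch.isspace():
            return tok[:i], True
    return tok, False


@lru_cache(maxsize=None)
def allowed_tokens(prefix):
    """Tokens that keep prefix on a path to a valid name; memoized per prefix state."""
    allowed = set()
    for tok in VOCAB:
        head, ends_name = split_at_whitespace(tok)
        name = prefix + head
        node = walk(name)
        if node is None:
            continue  # no valid package starts with this string
        if ends_name and name and END not in node:
            continue  # whitespace would terminate an incomplete name
        allowed.add(tok)
    return frozenset(allowed)


def constrain(text, candidates):
    """Filter candidate next tokens only while a package name is being generated."""
    prefix = package_prefix(text)
    if prefix is None:
        return list(candidates)  # outside an install command: no intervention
    permitted = allowed_tokens(prefix)
    return [tok for tok in candidates if tok in permitted]
```

For example, `constrain("Run: pip install req", ["uests", "reqests", "\n"])` keeps only `"uests"`, because `requests` is the only listed name that extends `req`, while ordinary prose passes through unfiltered. In an actual decoder the same filter would be applied as a mask over next-token logits rather than over a list of strings.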