The paper addresses semantic hallucinations in unpaired day-to-night image translation, where objects such as traffic signs are incorrectly synthesized. The authors propose a framework that uses a dual-head discriminator to detect hallucinations via semantic segmentation, together with class-specific prototypes that act as semantic anchors. Integrated into a Schrödinger Bridge-based translation model, the method iteratively pushes hallucinated features away from class prototypes and achieves significant improvements in downstream object detection, particularly for hallucination-prone classes.
Unpaired image translation gets a boost with a new method that suppresses hallucinated objects, leading to a 15.5% mAP improvement in day-to-night domain adaptation on BDD100K.
Day-to-night unpaired image translation is important to downstream tasks but remains challenging due to large appearance shifts and the lack of direct pixel-level supervision. Existing methods often introduce semantic hallucinations, where objects from target classes such as traffic signs and vehicles, as well as man-made light effects, are incorrectly synthesized. These hallucinations significantly degrade downstream performance. We propose a novel framework that detects and suppresses hallucinations of target-class features during unpaired translation. To detect hallucination, we design a dual-head discriminator that additionally performs semantic segmentation to identify hallucinated content in background regions. To suppress these hallucinations, we introduce class-specific prototypes, constructed by aggregating features of annotated target-domain objects, which act as semantic anchors for each class. Built upon a Schrödinger Bridge-based translation model, our framework performs iterative refinement, where detected hallucination features are explicitly pushed away from class prototypes in feature space, thus preserving object semantics across the translation trajectory. Experiments show that our method outperforms existing approaches both qualitatively and quantitatively. On the BDD100K dataset, it improves mAP by 15.5% for day-to-night domain adaptation, with a notable 31.7% gain for classes such as traffic lights that are prone to hallucinations.
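The prototype-and-suppression idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not give the loss, so the mean aggregation, the cosine-similarity hinge, the background label `0`, and all function names (`class_prototypes`, `hallucination_mask`, `push_away_loss`) are assumptions made for illustration only. It also assumes a background mask for the translated image is available, for example from the source image's annotations.

```python
import numpy as np


def class_prototypes(features, labels, num_classes):
    """Average the features of annotated objects per class.

    features: (N, D) array, one feature vector per annotated object.
    labels:   (N,) integer class ids in [0, num_classes).
    Mean aggregation is an assumption; the abstract only says "aggregating".
    """
    protos = np.zeros((num_classes, features.shape[1]))
    for c in range(num_classes):
        members = features[labels == c]
        if len(members) > 0:
            protos[c] = members.mean(axis=0)
    return protos


def hallucination_mask(pred_classes, known_classes, background_id=0):
    """Flag locations known to be background where a target class is predicted.

    pred_classes:  (H, W) class ids, e.g. from a segmentation head.
    known_classes: (H, W) class ids that mark true background (assumed given).
    """
    return (known_classes == background_id) & (pred_classes != background_id)


def push_away_loss(feature_map, mask, pred_classes, protos, margin=0.0):
    """Hinge on cosine similarity between flagged features and the prototype
    of the class they were mistaken for. Minimizing it moves the flagged
    features away from that prototype. The hinge form is an assumption.
    """
    flagged = feature_map[mask]              # (K, D)
    if flagged.shape[0] == 0:
        return 0.0
    target = protos[pred_classes[mask]]      # (K, D)
    eps = 1e-8
    cos = (flagged * target).sum(axis=-1) / (
        np.linalg.norm(flagged, axis=-1) * np.linalg.norm(target, axis=-1) + eps
    )
    return float(np.maximum(cos - margin, 0.0).mean())
```

In a training loop, a term like `push_away_loss` would be added to the translation objective at each refinement step, so that features flagged as hallucinated drift away from the prototype of the class they resemble while features of real objects are left untouched.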