CUHK, HKPolyU, HKUST | Mar 16, 2026 | arXiv:2603.15555

Learning Latent Proxies for Controllable Single-Image Relighting

Haoze Zheng, Zihao Wang, Xianfeng Wu, Yajing Bai, Yexin Liu, Yun Li, Xiaogang Xu, Harry Yang

AI Summary

LightCtrl, a novel single-image relighting approach, uses a few-shot latent proxy encoder to extract compact material-geometry cues from limited PBR supervision, combined with a lighting-aware mask to guide a diffusion model towards shading-relevant pixels. To address the scarcity of PBR data, the method refines the proxy branch using a DPO-based objective that enforces physical consistency in the predicted cues. The approach, trained on the new ScaLight dataset, achieves state-of-the-art photorealistic relighting with accurate continuous control, outperforming existing diffusion and intrinsic-based methods.

Key Contribution

Forget full intrinsic decomposition: LightCtrl achieves state-of-the-art single-image relighting by learning sparse, physically meaningful latent proxies from limited PBR data.

Abstract

Single-image relighting is highly under-constrained: small illumination changes can produce large, nonlinear variations in shading, shadows, and specularities, while geometry and materials remain unobserved. Existing diffusion-based approaches either rely on intrinsic or G-buffer pipelines that require dense and fragile supervision, or operate purely in latent space without physical grounding, making fine-grained control of direction, intensity, and color unreliable. We observe that a full intrinsic decomposition is unnecessary and redundant for accurate relighting. Instead, sparse but physically meaningful cues, indicating where illumination should change and how materials should respond, are sufficient to guide a diffusion model. Based on this insight, we introduce LightCtrl, which integrates physical priors at two levels: a few-shot latent proxy encoder that extracts compact material-geometry cues from limited PBR supervision, and a lighting-aware mask that identifies illumination-sensitive regions and steers the denoiser toward shading-relevant pixels. To compensate for scarce PBR data, we refine the proxy branch using a DPO-based objective that enforces physical consistency in the predicted cues. We also present ScaLight, a large-scale object-level dataset with systematically varied illumination and complete camera-light metadata, enabling physically consistent and controllable training. Across object- and scene-level benchmarks, our method achieves photometrically faithful relighting with accurate continuous control, surpassing prior diffusion and intrinsic-based baselines, including gains of up to +2.4 dB PSNR and 35% lower RMSE under controlled lighting shifts.
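The abstract reports its gains in PSNR (dB) and RMSE. As a reading aid, here is a minimal sketch of how these two metrics are conventionally computed and how they relate. It assumes pixel values normalized to [0, 1] and images flattened to plain lists. The function names and toy values are hypothetical, and the paper's actual evaluation protocol (color space, masking, peak value) is not specified in this listing.

```python
import math


def rmse(pred, target):
    """Root-mean-square error between two equal-length sequences of pixel values."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return math.sqrt(mse)


def psnr(pred, target, peak=1.0):
    """PSNR in dB, assuming pixel values lie in [0, peak] (an assumption here)."""
    err = rmse(pred, target)
    if err == 0.0:
        return math.inf  # identical images: PSNR is unbounded
    return 20.0 * math.log10(peak / err)


def rmse_ratio_for_psnr_gain(gain_db):
    """RMSE ratio (new / old) implied by a PSNR gain at a fixed peak value."""
    return 10.0 ** (-gain_db / 20.0)


# Toy flattened "images" (hypothetical values, not data from the paper).
target = [0.20, 0.40, 0.60, 0.80]
baseline = [0.30, 0.30, 0.70, 0.70]  # every pixel off by 0.10
improved = [0.25, 0.35, 0.65, 0.75]  # every pixel off by 0.05

gain_db = psnr(improved, target) - psnr(baseline, target)
print(f"RMSE baseline: {rmse(baseline, target):.3f}")
print(f"RMSE improved: {rmse(improved, target):.3f}")
print(f"PSNR gain: {gain_db:.2f} dB")
```

Because PSNR is a logarithmic function of RMSE, halving the RMSE on the same image pair always yields a gain of about 6.02 dB. The "up to" figures quoted in the abstract may come from different benchmarks or lighting settings, so they need not satisfy this relation jointly.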

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
