The authors train an unsupervised generative model on a new dataset of painterly objects with systematic variations in gloss and artistic style, in order to investigate how these factors are represented in learned models. They find a hierarchical latent space in which gloss is disentangled from other appearance factors, which lets them study how the representation of gloss varies across artistic styles. Building on this representation, they introduce a lightweight adapter that connects the latent space to a latent-diffusion model, enabling non-photorealistic image synthesis with fine-grained control of gloss and style.
Humans can infer material characteristics of objects from their visual appearance, and this ability extends to artistic depictions, where similar perceptual strategies guide the interpretation of paintings or drawings. Among the factors that define material appearance, gloss, along with color, is widely regarded as one of the most important, and recent studies indicate that humans can perceive gloss independently of the artistic style used to depict an object. To investigate how gloss and artistic style are represented in learned models, we train an unsupervised generative model on a newly curated dataset of painterly objects designed to systematically vary such factors. Our analysis reveals a hierarchical latent space in which gloss is disentangled from other appearance factors, allowing for a detailed study of how gloss is represented and varies across artistic styles. Building on this representation, we introduce a lightweight adapter that connects our style- and gloss-aware latent space to a latent-diffusion model, enabling the synthesis of non-photorealistic images with fine-grained control of these factors. We compare our approach with previous models and observe improved disentanglement and controllability of the learned factors.