CustomTex addresses the challenge of high-fidelity, instance-level 3D scene texturing by using reference images to guide the generation of a unified texture map. The method employs a dual-distillation approach: semantic-level distillation uses instance cross-attention to align each reference image with its object instance, and pixel-level distillation enforces high visual fidelity. Both operate within a Variational Score Distillation (VSD) framework. Results show that CustomTex achieves better instance consistency, sharper textures, and fewer artifacts than existing methods, including text-driven ones.
Forget generic textures – CustomTex lets you clone real-world object appearances onto your 3D scenes with uncanny fidelity.
The creation of high-fidelity, customizable 3D indoor scene textures remains a significant challenge. While text-driven methods offer flexibility, they lack the precision for fine-grained, instance-level control, and often produce textures with insufficient quality, artifacts, and baked-in shading. To overcome these limitations, we introduce CustomTex, a novel framework for instance-level, high-fidelity scene texturing driven by reference images. CustomTex takes an untextured 3D scene and a set of reference images specifying the desired appearance for each object instance, and generates a unified, high-resolution texture map. The core of our method is a dual-distillation approach that separates semantic control from pixel-level enhancement. We employ semantic-level distillation, equipped with instance cross-attention, to ensure semantic plausibility and "reference-instance" alignment, and pixel-level distillation to enforce high visual fidelity. Both are unified within a Variational Score Distillation (VSD) optimization framework. Experiments demonstrate that CustomTex achieves precise instance-level consistency with reference images and produces textures with superior sharpness, reduced artifacts, and minimal baked-in shading compared to state-of-the-art methods. Our work establishes a more direct and user-friendly path to high-quality, customizable 3D scene appearance editing.
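To make the idea of two distillation terms sharing one VSD-style update more concrete, here is a toy sketch. It is not the authors' implementation, and every name in it (`point_mass_eps`, `vsd_grad`, `dual_distillation_step`, `lam_sem`, `lam_pix`) is hypothetical. It assumes an identity renderer (the texture is the rendered image), sets the VSD time weighting to 1, and replaces both the pretrained diffusion prior and the fine-tuned score of current renders with exact noise predictors for point-mass distributions. The loss weights and noise levels are arbitrary, and instance cross-attention is omitted entirely. The VSD-style gradient is the difference between the two noise predictions at a noised render.

```python
import random

def point_mass_eps(center, alpha, sigma):
    """Exact noise predictor for a distribution concentrated at `center`.
    Stand-in (assumption) for a real diffusion model's noise prediction."""
    def eps(x_t):
        return [(xt - alpha * c) / sigma for xt, c in zip(x_t, center)]
    return eps

def vsd_grad(render, eps_target, eps_render, alpha, sigma, rng):
    """VSD-style gradient w.r.t. the render: eps_target(x_t) - eps_render(x_t)."""
    # Noise the render to level (alpha, sigma), as in score distillation.
    x_t = [alpha * x + sigma * rng.gauss(0.0, 1.0) for x in render]
    return [a - b for a, b in zip(eps_target(x_t), eps_render(x_t))]

def dual_distillation_step(texture, reference, rng, lr=0.05,
                           lam_sem=1.0, lam_pix=0.5):
    """One gradient step combining two distillation terms (hypothetical weights)."""
    render = list(texture)  # assumption: identity renderer
    grad = [0.0] * len(texture)
    # Two terms at different (arbitrary) noise levels: "semantic" and "pixel".
    for lam, alpha, sigma in ((lam_sem, 0.6, 0.8), (lam_pix, 0.95, 0.3)):
        g = vsd_grad(render,
                     point_mass_eps(reference, alpha, sigma),  # target prior
                     point_mass_eps(render, alpha, sigma),     # score of renders
                     alpha, sigma, rng)
        grad = [acc + lam * gi for acc, gi in zip(grad, g)]
    return [t - lr * gi for t, gi in zip(texture, grad)]

rng = random.Random(0)
reference = [0.2, 0.5, 0.9]   # toy "reference appearance"
texture = [0.0, 0.0, 0.0]     # toy "untextured" starting point
for _ in range(100):
    texture = dual_distillation_step(texture, reference, rng)
```

In this toy setting the sampled noise cancels between the two predictors, so each step simply moves the texture toward the reference. With real diffusion models the two predictions differ in richer ways, and the reference would enter through conditioning rather than as a direct pixel target.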