BLOCK is a bi-stage pipeline for generating Minecraft skins from character concepts. In the first stage, a large multimodal model (MLLM) synthesizes a 3D preview of the character; in the second, a fine-tuned FLUX.2 model decodes that preview into a skin. The pipeline is trained with the EvolveLoRA curriculum, which fine-tunes LoRA adapters progressively from text-to-image to image-to-image to preview-to-skin generation, initializing each phase from the previous adapter. BLOCK is released as open source, including prompt templates and fine-tuned weights, to support reproducible character-to-skin generation.
Craft pixel-perfect Minecraft skins from just a character concept with BLOCK, an open-source pipeline that leverages MLLMs and progressive LoRA fine-tuning.
We present \textbf{BLOCK}, an open-source bi-stage character-to-skin pipeline that generates pixel-perfect Minecraft skins from arbitrary character concepts. BLOCK decomposes the problem into (i) a \textbf{3D preview synthesis stage} driven by a large multimodal model (MLLM) with a carefully designed prompt-and-reference template, producing a consistent dual-panel (front/back) oblique-view Minecraft-style preview; and (ii) a \textbf{skin decoding stage} based on a fine-tuned FLUX.2 model that translates the preview into a skin atlas image. We further propose \textbf{EvolveLoRA}, a progressive LoRA curriculum (text-to-image $\rightarrow$ image-to-image $\rightarrow$ preview-to-skin) that initializes each phase from the previous adapter to improve stability and efficiency. BLOCK is released with all prompt templates and fine-tuned weights to support reproducible character-to-skin generation.
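The control flow of the EvolveLoRA curriculum, in which each phase starts from the adapter produced by the previous phase, can be illustrated with a small sketch. This is not the released BLOCK code: the names `LoRAAdapter`, `init_adapter`, `train_phase`, and `run_curriculum` are hypothetical, the adapter dimensions are arbitrary, and `train_phase` is a stand-in that perturbs the weights instead of running a real fine-tuning loop on FLUX.2.

```python
import copy
import random
from dataclasses import dataclass, field
from typing import List

# Phase order as stated in the abstract.
PHASES = ["text-to-image", "image-to-image", "preview-to-skin"]


@dataclass
class LoRAAdapter:
    """Toy low-rank adapter; the weight update it represents is B @ A."""
    A: List[List[float]]  # rank x d_in
    B: List[List[float]]  # d_out x rank
    history: List[str] = field(default_factory=list)  # phases trained so far


def init_adapter(d_in: int, d_out: int, rank: int, seed: int = 0) -> LoRAAdapter:
    """Common LoRA initialization: small random A, zero B (update starts at 0)."""
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
    B = [[0.0 for _ in range(rank)] for _ in range(d_out)]
    return LoRAAdapter(A=A, B=B)


def train_phase(init: LoRAAdapter, phase: str, steps: int = 10) -> LoRAAdapter:
    """Placeholder for fine-tuning (assumption): a real loop would minimize
    the loss of the given phase. Here B is only nudged with small noise."""
    rng = random.Random(len(init.history))
    adapter = copy.deepcopy(init)  # warm start: copy the previous adapter
    for _ in range(steps):
        for row in adapter.B:
            for j in range(len(row)):
                row[j] += rng.gauss(0.0, 0.001)
    adapter.history.append(phase)
    return adapter


def run_curriculum(phases: List[str] = PHASES) -> LoRAAdapter:
    """Run the phases in order, initializing each from the previous adapter."""
    adapter = init_adapter(d_in=8, d_out=8, rank=2)
    for phase in phases:
        adapter = train_phase(adapter, phase)
    return adapter


if __name__ == "__main__":
    final = run_curriculum()
    print(final.history)
```

The point of the sketch is the loop in `run_curriculum`: only the first phase starts from a fresh adapter, and every later phase reuses the weights learned so far, which is the property the abstract credits for improved stability and efficiency.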