Stability · May 28, 2026 · arXiv:2605.30257

Stable-Layers: Fine-Tuning Image Layer Decomposition Models with VLM-Scored Reinforcement Learning

Ciara Rowles, Reshinth Adithyan, Nikhil Pinnaparaju, Vikram S. Voleti, Mark Boss

AI Summary

Stable-Layers fine-tunes a pretrained image layer decomposition model (Qwen-Image-Layered) using reinforcement learning, eliminating the need for paired supervision. The method applies Flow-GRPO with LoRA, sampling multiple candidate decompositions per image and scoring them with a VLM. A two-stage evaluation pipeline, pairing structured per-sample scoring with grid-based calibration, counters the tendency of VLMs to compress their scores into a narrow band. The fine-tuned model shows improved layer separation and fewer artifacts.

Key Contribution

Using a VLM as the sole source of reward, Stable-Layers fine-tunes an image layer decomposition model with RL and, without paired supervision, achieves stronger layer separation and fewer artifacts than the base model.

Abstract

We present Stable-Layers, a reinforcement learning framework that eliminates the need for paired supervision by fine-tuning a pretrained layer decomposition model using only feedback from a vision-language model (VLM). Starting from Qwen-Image-Layered, we apply Flow-GRPO with LoRA adaptation, sampling multiple candidate decompositions per image, scoring them with a VLM, and optimising the policy from group-relative advantages. The key challenge lies in designing a reliable reward signal: VLMs scoring samples in isolation tend to compress their judgements into a narrow band, leaving GRPO with little within-group variance to learn from. We address this with a two-stage evaluation pipeline that pairs structured per-sample scoring across five edit-centric criteria with a grid-based calibration step in which the VLM re-scores all candidates side-by-side. Stable-Layers produces decompositions with stronger layer separation, fewer blank or artifact-heavy layers, and lower per-layer reconstruction error on the Crello dataset compared to the base model.
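The narrow-band problem described in the abstract can be illustrated with a minimal sketch of group-relative advantages. It assumes the commonly used GRPO form, in which each candidate's reward is standardised against the mean and standard deviation of its own group; the exact normalisation used by Stable-Layers is not stated here, and the scores below are hypothetical values chosen for illustration, not results from the paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Standardise rewards within one group of candidates for the same image.

    Assumed form: (r - group mean) / (group std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical VLM scores for four candidate decompositions of one image.
compressed = [7.0, 7.0, 7.0, 7.0]  # isolated scoring collapses to a tie
calibrated = [4.0, 6.0, 7.0, 9.0]  # side-by-side re-scoring separates them

flat = group_relative_advantages(compressed)
spread = group_relative_advantages(calibrated)

print(flat)    # every advantage is zero: no learning signal for this group
print(spread)  # negative for the weakest candidate, positive for the strongest
```

When every candidate in a group receives the same score, all advantages are zero and the group contributes nothing to the policy update, which is why a calibration step that separates candidates within a group matters for this style of training.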

Computer Vision · Multimodal Models · RLHF & Preference Learning

Citation Metrics

Citations: 0
Influential citations: 0
References: 39
Year: 2026
Venue: N/A
