This paper introduces MOFA-VTON, a virtual try-on method that lets users adjust how the target clothing is worn by drawing simple sketches. A mask construction strategy converts the user-drawn curve sketches into a dual-region mask, which replaces the traditional clothing-agnostic mask and provides fine-grained layout guidance for generation. Layout adjustment blocks then use cross-attention to learn layout correspondences for the upper and lower body regions independently, refining the spatial arrangement of the two regions. Experiments on the VITON-HD and DressCode datasets show that MOFA-VTON outperforms previous state-of-the-art methods while offering more flexible try-on results.
Users can sketch how they want a garment to be worn, and the virtual try-on result adapts the clothing layout to match.
Virtual try-on aims to fit an in-shop clothing image onto a specific human body. An optimal virtual try-on method should provide diverse and flexible dressing options, accurately reflecting the varied wearing styles encountered in real-life scenarios, tailored to individual preferences and fashion aspirations. However, current methods predominantly perform a direct replacement of the original clothing with the target clothing, following the same dressing pattern. This limited control over clothing adaptation may result in fixed and monotonous try-on outputs. To explore More Fashion Possibilities with Fine-Grained Adaptations in Virtual Try-On, we propose a novel virtual try-on method, termed MOFA-VTON, which allows users to adjust clothing adaptations in try-on results through simple sketches. Specifically, we first design a mask construction strategy that transforms user-drawn curve sketches into a dual-region mask, replacing the traditional clothing-agnostic mask and providing fine-grained layout guidance for the subsequent generation process. Further, we propose layout adjustment blocks that utilize the cross-attention mechanism to independently learn layout correspondences for the upper and lower regions of the human body, refining the spatial arrangement of the two regions. With these implementations, our method enables flexible and fine-grained adaptations of target clothing, overcoming the constraints of a fixed layout. Extensive experiments on the VITON-HD and DressCode datasets demonstrate that our proposed MOFA-VTON outperforms previous state-of-the-art methods and provides more fashion possibilities for virtual try-on.
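The two components named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how the curve is encoded or how the layout adjustment blocks are built. The example assumes the user's curve is given as one boundary row per image column (`curve_y`, a hypothetical encoding), labels the masked pixels above the curve as the upper region and the rest as the lower region, and then runs plain scaled dot-product cross-attention separately for the queries of each region. All function names here are illustrative.

```python
import math

def dual_region_mask(agnostic, curve_y):
    """Split a binary clothing-agnostic mask into two labelled regions.

    agnostic: list of rows of 0/1 values (1 = pixel to be regenerated).
    curve_y:  assumed encoding of the sketch; curve_y[x] is the boundary
              row for column x.
    Returns a grid with 0 = unmasked, 1 = upper region, 2 = lower region.
    """
    return [
        [0 if not v else (1 if y < curve_y[x] else 2) for x, v in enumerate(row)]
        for y, row in enumerate(agnostic)
    ]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def region_cross_attention(queries, keys, values, q_region, region):
    """Scaled dot-product cross-attention for queries labelled `region`.

    Queries with a different label are copied through unchanged, so this
    toy version assumes query and value vectors share the same dimension.
    """
    d = len(keys[0])
    out = []
    for q, label in zip(queries, q_region):
        if label != region:
            out.append(list(q))
            continue
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out

# Toy inputs: a 4x4 agnostic mask and a curve that dips on the right side.
agnostic = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
mask = dual_region_mask(agnostic, curve_y=[2, 2, 1, 1])

# Two body-feature queries (one per region) attending to clothing features.
body = [[1.0, 0.0], [0.0, 1.0]]
cloth = [[1.0, 0.0], [0.0, 1.0]]
upper = region_cross_attention(body, cloth, cloth, q_region=[1, 2], region=1)
both = region_cross_attention(upper, cloth, cloth, q_region=[1, 2], region=2)
```

Handling the two regions in separate attention passes mirrors the abstract's description of learning upper and lower layout correspondences independently; a real model would use learned projections and operate on feature maps rather than raw lists.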