Apr 22, 2026

arXiv:2604.20317

MD-Face: MoE-Enhanced Label-Free Disentangled Representation for Interactive Facial Attribute Editing

Xuan Cui, Yunfei Zhao, Bo Liu, Wei Duan, Xingrong Fan

AI Summary

MD-Face, a novel GAN-based framework, achieves label-free disentangled representation learning for facial attribute editing by employing a Mixture of Experts (MoE) backbone with dynamic expert allocation. A geometry-aware loss function, aligning semantic vectors with Semantic Boundary Vectors (SBV) via a Jacobian-based pushforward, further reduces attribute entanglement. Experiments on ProGAN and StyleGAN demonstrate that MD-Face achieves comparable performance to supervised methods while surpassing unsupervised baselines, offering improved image quality and lower latency than diffusion-based approaches.
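To make "dynamic expert allocation" concrete, here is a minimal sketch of a gated mixture of experts over a latent code. It is not the MD-Face implementation: the summary does not specify the gate, so softmax gating with top-k routing is assumed here as one common choice, and every name (`moe_forward`, `gate_W`, `expert_Ws`, `top_k`) is hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def moe_forward(z, gate_W, expert_Ws, top_k=2):
    """Route latent code z through the top_k experts picked by a softmax gate.

    Assumption: linear experts and a linear gate, chosen only for illustration.
    Returns the mixed output and the per-expert weights (zero if not selected).
    """
    scores = softmax(matvec(gate_W, z))
    # Keep the top_k highest-scoring experts for this particular input.
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    norm = sum(scores[i] for i in chosen)

    out = [0.0] * len(expert_Ws[0])
    weights = [0.0] * len(scores)
    for i in chosen:
        weights[i] = scores[i] / norm  # renormalise over the selected experts
        y = matvec(expert_Ws[i], z)
        out = [o + weights[i] * yi for o, yi in zip(out, y)]
    return out, weights


if __name__ == "__main__":
    gate_W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]         # 3 experts, 2-d latent
    expert_Ws = [[[1.0, 0.0]], [[0.0, 1.0]], [[1.0, 1.0]]]  # each maps 2-d -> 1-d
    print(moe_forward([2.0, 1.0], gate_W, expert_Ws, top_k=2))
```

Because the gate depends on `z`, different latent codes activate different experts, which is the sense in which the allocation is dynamic.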

Key Contribution

GAN-based face editing can learn disentangled attributes *without* labeled data, performing competitively with supervised methods while running faster than diffusion-based editors.

Abstract

GAN-based facial attribute editing is widely used in virtual avatars and social media but often suffers from attribute entanglement, where modifying one facial attribute unintentionally alters others. While supervised disentangled representation learning can address this, it relies heavily on labeled data, incurring high annotation costs. To address these challenges, we propose MD-Face, a label-free disentangled representation learning framework based on Mixture of Experts (MoE). MD-Face uses an MoE backbone with a gating mechanism that dynamically allocates experts, enabling the model to learn semantic vectors with greater independence. To further reduce attribute entanglement, we introduce a geometry-aware loss, which aligns each semantic vector with its corresponding Semantic Boundary Vector (SBV) through a Jacobian-based pushforward method. Experiments with ProGAN and StyleGAN show that MD-Face outperforms unsupervised baselines and is competitive with supervised ones. Compared to diffusion-based methods, it offers better image quality and lower inference latency, making it well suited to interactive editing.
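The geometry-aware loss can be illustrated with a small sketch. The abstract does not give the formula, so the following is an assumption: the pushforward of a latent direction `v` at `z` is taken to be the Jacobian-vector product J_g(z) v, approximated by a central finite difference, and alignment is scored as one minus cosine similarity with a target vector standing in for the SBV. The names `pushforward`, `alignment_loss`, and `sbv` are hypothetical.

```python
import math


def pushforward(g, z, v, eps=1e-4):
    """Approximate the Jacobian-vector product J_g(z) v by central differences."""
    z_plus = [zi + eps * vi for zi, vi in zip(z, v)]
    z_minus = [zi - eps * vi for zi, vi in zip(z, v)]
    return [(a - b) / (2 * eps) for a, b in zip(g(z_plus), g(z_minus))]


def cosine(a, b):
    """Cosine similarity with a small constant to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-12)


def alignment_loss(g, z, v, sbv):
    """Assumed loss form: near 0 when the pushed-forward direction is parallel to sbv."""
    return 1.0 - cosine(pushforward(g, z, v), sbv)


if __name__ == "__main__":
    # Toy stand-in for a generator feature map; not a real GAN.
    def toy_g(z):
        return [2.0 * z[0], 3.0 * z[1]]

    z = [0.5, -0.2]
    print(alignment_loss(toy_g, z, [1.0, 0.0], [1.0, 0.0]))  # aligned direction
    print(alignment_loss(toy_g, z, [1.0, 0.0], [0.0, 1.0]))  # orthogonal direction
```

In a real training loop the Jacobian-vector product would come from automatic differentiation rather than finite differences; the finite-difference form is used here only to keep the sketch dependency-free.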

Architecture Design (Transformers, SSMs, MoE)

Computer Vision

Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 31

Year: 2026

Venue: N/A
