The paper introduces Fréchet Distance loss (FD-loss), a method for optimizing Fréchet Distance in representation space by decoupling the population size for FD estimation from the batch size for gradient computation. Optimizing FD-loss improves visual quality in generators, achieving 0.72 FID on ImageNet 256x256 with a one-step generator when using the Inception feature space. The authors also show that FD-loss can repurpose multi-step generators into strong one-step generators and introduce FDr^k, a multi-representation metric, to address the issue of FID misranking visual quality.
Fréchet Distance, previously deemed impractical for training, can now be effectively optimized in representation space, leading to surprisingly high-quality image generation and a new metric that better aligns with human perception.
We show that Fréchet Distance (FD), long considered impractical as a training objective, can in fact be effectively optimized in the representation space. Our idea is simple: decouple the population size for FD estimation (e.g., 50k) from the batch size for gradient computation (e.g., 1024). We term this approach FD-loss. Optimizing FD-loss reveals several surprising findings. First, post-training a base generator with FD-loss in different representation spaces consistently improves visual quality. Under the Inception feature space, a one-step generator achieves 0.72 FID on ImageNet 256x256. Second, the same FD-loss repurposes multi-step generators into strong one-step generators without teacher distillation, adversarial training, or per-sample targets. Third, FID can misrank visual quality: modern representations can yield better samples despite worse Inception FID. This motivates FDr^k, a multi-representation metric. We hope this work will encourage further exploration of distributional distances in diverse representation spaces as both training objectives and evaluation metrics for generative models.
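For readers unfamiliar with the quantity being optimized, the sketch below computes the standard closed-form Fréchet Distance between two Gaussians fitted to feature sets, FD = ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^{1/2}), which is the formula underlying FID. It is only an illustration of the metric, not the authors' FD-loss: it does not show gradients or the decoupling of population size from batch size, and the function name `frechet_distance` and the random stand-in features are assumptions made for this example.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet Distance between Gaussians fitted to two (N, D) feature arrays."""
    mu_a = feats_a.mean(axis=0)
    mu_b = feats_b.mean(axis=0)
    # atleast_2d keeps the D == 1 case working with matrix products.
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr((C_a C_b)^{1/2}) is the sum of the square roots of the eigenvalues
    # of C_a @ C_b. For positive semi-definite inputs these eigenvalues are
    # real and non-negative, so tiny numerical noise is clipped away.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Hypothetical stand-in features; a real evaluation would use features from
# a representation network (for example Inception) on real and generated images.
rng = np.random.default_rng(0)
real_feats = rng.normal(size=(2000, 8))
fake_feats = rng.normal(loc=0.3, size=(2000, 8))
fd_value = frechet_distance(real_feats, fake_feats)
```

The mean and covariance estimates are only reliable when the number of samples is large relative to the feature dimension, which is one reason a large population (e.g., 50k) is commonly used when estimating FD.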