Adobe Research | Howard University | Apr 29, 2026 | arXiv:2604.26186

FASH-iCNN: Making Editorial Fashion Identity Inspectable Through Multimodal CNN Probing

Morayo Danielle Adeyemi, Ryan A. Rossi, Franck Dernoncourt

AI Summary

FASH-iCNN is a multimodal CNN trained on 87,547 Vogue runway images to identify the fashion house, era, and color tradition of a garment. A clothing-only model identifies the fashion house at 78.2% top-1, the decade at 88.6% top-1, and the specific year at 58.3% top-1, which suggests it captures editorial fashion identity. Probing the visual channels shows that texture and luminance are the primary carriers of that identity: removing texture costs 37.6pp of house accuracy, while removing color costs only 10.6pp.

Key Contribution

Forget color: texture is the secret sauce of high fashion, according to a model that pinpoints a garment's fashion house at 78.2% top-1 accuracy.

Abstract

Fashion AI systems routinely encode the aesthetic logic of specific houses, editors, and historical moments without disclosing it. We present FASH-iCNN, a multimodal system trained on 87,547 Vogue runway images across 15 fashion houses spanning 1991-2024 that makes this cultural logic inspectable. Given a photograph of a garment, the system recovers which house produced it, which era it belongs to, and which color tradition it reflects. A clothing-only model identifies the fashion house at 78.2% top-1 across 14 houses, the decade at 88.6% top-1, and the specific year at 58.3% top-1 across 34 years with a mean error of just 2.2 years. Probing which visual channels carry this signal reveals a sharp dissociation: removing color costs only 10.6pp of house identity accuracy, while removing texture costs 37.6pp, establishing texture and luminance as the primary carriers of editorial identity. FASH-iCNN treats editorial culture as the signal rather than background noise, identifying which houses, eras, and color traditions shaped each output so that users can see not just what the system predicts but which houses, editors, and historical moments are encoded in that prediction.
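The channel-probing result in the abstract can be illustrated with a minimal sketch. The abstract does not say how color or texture were removed, so the operations below are assumptions chosen for illustration: luminance conversion stands in for "removing color" and a box blur stands in for "removing texture". All function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def remove_color(img):
    """Assumed color ablation: replace RGB with Rec. 601 luminance,
    replicated to three channels so the input shape is unchanged."""
    weights = np.array([0.299, 0.587, 0.114])
    luma = img @ weights                      # shape (H, W)
    return np.repeat(luma[..., None], 3, axis=2)

def remove_texture(img, k=9):
    """Assumed texture ablation: k x k box blur per channel, which
    keeps coarse color and shape but suppresses fine detail."""
    pad = k // 2
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape[:2]
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def top1_accuracy(pred, true):
    """Fraction of examples whose predicted label matches the true label."""
    return float(np.mean(np.asarray(pred) == np.asarray(true)))

def drop_pp(baseline_acc, ablated_acc):
    """Accuracy lost to an ablation, in percentage points."""
    return 100.0 * (baseline_acc - ablated_acc)
```

In use, one would evaluate the same trained classifier on the original, color-ablated, and texture-ablated test images and compare the drops. If the reported drops are measured against the 78.2% clothing-only baseline (an assumption), they correspond to roughly 67.6% accuracy without color and 40.6% without texture.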

Computer Vision | Interpretability & Mechanistic Interp | Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
