Feb 25, 2026 · arXiv:2602.21698

E-comIQ-ZH: A Human-Aligned Dataset and Benchmark for Fine-Grained Evaluation of E-commerce Posters with Chain-of-Thought

AI Summary

The authors introduce E-comIQ-ZH, a new framework for evaluating the quality of Chinese e-commerce posters, addressing the limitations of existing methods that fail to capture functional criteria and subtle textual artifacts specific to Chinese characters. They construct E-comIQ-18k, a dataset featuring multi-dimensional scores and Chain-of-Thought rationales calibrated by experts, and train E-comIQ-M, a specialized evaluation model. Experiments demonstrate that E-comIQ-M aligns more closely with human expert judgment, enabling scalable automated assessment of e-commerce posters.

Key Contribution

Finally, a benchmark that can tell if your AI-generated Taobao ad actually makes sense, going beyond just "does it look pretty?"

Abstract

Generative AI is widely used to create commercial posters. However, rapid advances in generation have outpaced automated quality assessment. Existing models emphasize generic esthetics or low-level distortions and lack the functional criteria required for e-commerce design. Assessment is especially challenging for Chinese content, where complex characters often produce subtle but critical textual artifacts that existing methods overlook. To address this, we introduce E-comIQ-ZH, a framework for evaluating Chinese e-commerce posters. We build E-comIQ-18k, the first dataset to feature multi-dimensional scores and expert-calibrated Chain-of-Thought (CoT) rationales. Using this dataset, we train E-comIQ-M, a specialized evaluation model that aligns with human expert judgment. Our framework enables E-comIQ-Bench, the first automated and scalable benchmark for the generation of Chinese e-commerce posters. Extensive experiments show that E-comIQ-M aligns more closely with expert standards and enables scalable automated assessment of e-commerce posters. All datasets, models, and evaluation tools will be released to support future research in this area. Code will be available at https://github.com/4mm7/E-comIQ-ZH.
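The abstract does not say how alignment with expert judgment is measured. As a rough illustration only, the sketch below assumes one common choice for this kind of evaluation, a per-dimension Spearman rank correlation between model scores and expert scores. The function names, the dimension names (`text_quality`, `layout`), and the sample scores are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / (sd_x * sd_y)


def spearman(model_scores, expert_scores):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(model_scores) != len(expert_scores):
        raise ValueError("score lists must have the same length")
    return pearson(average_ranks(model_scores), average_ranks(expert_scores))


def alignment_by_dimension(model, expert):
    """Compute Spearman correlation separately for each scoring dimension."""
    return {dim: spearman(model[dim], expert[dim]) for dim in expert}


# Hypothetical scores for four posters on two made-up dimensions.
model = {
    "text_quality": [4.5, 2.0, 3.5, 1.0],
    "layout": [3.0, 4.0, 2.0, 5.0],
}
expert = {
    "text_quality": [5, 2, 4, 1],
    "layout": [4, 3, 2, 5],
}
alignment = alignment_by_dimension(model, expert)
```

Reporting the correlation per dimension rather than as a single pooled number keeps a weakness in one criterion, such as rendered Chinese text, from being hidden by strong agreement on another.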

Eval Frameworks & Benchmarks · Multimodal Models · Reasoning & Chain-of-Thought

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
