The paper introduces a weighted transparency framework, grounded in the EU AI Act and the Stanford Transparency Index, for evaluating AI model documentation and addressing its current fragmentation and inconsistency. The authors developed an automated multi-agent pipeline that uses LLMs to extract documentation and score completeness across 50 models, revealing significant gaps, especially in safety-critical categories. The evaluation shows that frontier labs achieve higher compliance (around 80%) than other providers (below 60%), highlighting areas for improvement in AI transparency.
Most AI models fail to disclose critical safety information such as deception behaviors and hallucination risks, even those from top labs.
AI model documentation is fragmented across platforms and inconsistent in structure, preventing policymakers, auditors, and users from reliably assessing safety claims, data provenance, and version-level changes. We analyzed documentation from five frontier models (Gemini 3, Grok 4.1, Llama 4, GPT-5, and Claude 4.5) and 100 Hugging Face model cards, identifying 947 unique section names with extreme naming variation. Usage information alone appeared under 97 distinct labels. Using the EU AI Act Annex IV and the Stanford Transparency Index as baselines, we developed a weighted transparency framework with 8 sections and 23 subsections that prioritizes safety-critical disclosures (Safety Evaluation: 25%, Critical Risk: 20%) over technical specifications. We implemented an automated multi-agent pipeline that extracts documentation from public sources and scores completeness through LLM-based consensus. Evaluating 50 models across vision, multimodal, open-source, and closed-source systems cost less than $3 in total and revealed systematic gaps. Frontier labs (xAI, Microsoft, Anthropic) achieve approximately 80% compliance, while most providers fall below 60%. Safety-critical categories show the largest deficits: deception behaviors, hallucinations, and child safety evaluations account for 148, 124, and 116 aggregate points lost, respectively, across all evaluated models.
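As a rough illustration of how the weighted framework might aggregate scores, the sketch below computes an overall compliance percentage from per-section completeness values produced by a consensus of LLM judges. Only the Safety Evaluation (25%) and Critical Risk (20%) weights come from the abstract; the remaining section names, weights, and the simple averaging consensus are hypothetical placeholders, not the paper's actual scheme.

```python
# Minimal sketch of a weighted transparency score, assuming per-section
# completeness values in [0, 1] produced by an LLM-based consensus step.
# Only the Safety Evaluation (0.25) and Critical Risk (0.20) weights are
# stated in the paper; all other sections and weights are illustrative.

from statistics import mean

SECTION_WEIGHTS = {
    "safety_evaluation": 0.25,   # stated in the abstract
    "critical_risk": 0.20,       # stated in the abstract
    "data_provenance": 0.15,     # hypothetical
    "technical_specs": 0.10,     # hypothetical
    "usage": 0.10,               # hypothetical
    "versioning": 0.10,          # hypothetical
    "licensing": 0.05,           # hypothetical
    "environmental": 0.05,       # hypothetical
}

def consensus(judge_scores: list[float]) -> float:
    """Combine completeness scores from multiple LLM judges (simple mean)."""
    return mean(judge_scores)

def compliance(section_scores: dict[str, float]) -> float:
    """Weighted completeness across sections, expressed as a percentage."""
    total = sum(
        SECTION_WEIGHTS[name] * score
        for name, score in section_scores.items()
        if name in SECTION_WEIGHTS
    )
    return 100 * total / sum(SECTION_WEIGHTS.values())

if __name__ == "__main__":
    # Example: three hypothetical judges score each section's completeness.
    scores = {
        "safety_evaluation": consensus([0.60, 0.50, 0.55]),
        "critical_risk": consensus([0.40, 0.45, 0.50]),
        "data_provenance": consensus([0.80, 0.75, 0.80]),
        "technical_specs": consensus([0.90, 0.95, 0.90]),
        "usage": consensus([0.85, 0.80, 0.90]),
        "versioning": consensus([0.70, 0.65, 0.70]),
        "licensing": consensus([1.00, 1.00, 1.00]),
        "environmental": consensus([0.20, 0.30, 0.25]),
    }
    print(f"Overall compliance: {compliance(scores):.1f}%")
```

Because Safety Evaluation and Critical Risk together carry nearly half the weight in this kind of scheme, a model with thorough technical specifications but missing deception, hallucination, or child-safety disclosures would still score poorly, which matches the deficit pattern the abstract reports.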