Feb 19, 2026

arXiv:2602.17106

Toward Trustworthy Evaluation of Sustainability Rating Methodologies: A Human-AI Collaborative Framework for Benchmark Dataset Construction

Xiaoran Cai, Wang Yang, Xiyu Ren, Chekun Law, Rohit Sharma, Peng Qi

AI Summary

This paper introduces a human-AI collaborative framework with two parts, STRIDE and SR-Delta, to address the limited comparability and credibility of sustainability ratings across agencies. STRIDE supplies principled criteria and a scoring system that guide the LLM-based construction of firm-level benchmark datasets, while SR-Delta is a discrepancy-analysis procedure that surfaces insights for potential adjustments. Together they enable scalable and comparable assessment of sustainability rating methodologies, supporting more trustworthy and harmonized sustainability assessments.

Key Contribution

A human-AI collaboration framework offers a path to more trustworthy and comparable sustainability ratings by creating benchmark datasets that expose inconsistencies in existing methodologies.

Abstract

Sustainability or ESG rating agencies use company disclosures and external data to produce scores or ratings that assess the environmental, social, and governance performance of a company. However, sustainability ratings across agencies for a single company vary widely, limiting their comparability, credibility, and relevance to decision-making. To harmonize the rating results, we propose adopting a universal human-AI collaboration framework to generate trustworthy benchmark datasets for evaluating sustainability rating methodologies. The framework comprises two complementary parts: STRIDE (Sustainability Trust Rating & Integrity Data Equation), which provides principled criteria and a scoring system that guide the construction of firm-level benchmark datasets using large language models (LLMs), and SR-Delta, a discrepancy-analysis procedural framework that surfaces insights for potential adjustments. The framework enables scalable and comparable assessment of sustainability rating methodologies. We call on the broader AI community to adopt AI-powered approaches to strengthen and advance sustainability rating methodologies that support and enforce urgent sustainability agendas.

Constitutional AI & AI Ethics

Data Curation & Synthetic Data

Eval Frameworks & Benchmarks

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
