This paper addresses the lack of robust datasets for evaluating cultural alignment in LLMs by reviewing the limitations of existing datasets and proposing design guidelines for annotators. The authors construct a dataset that follows these guidelines and run contrastive experiments showing that it has greater discriminative power in distinguishing culturally specialized LLMs from general ones. The results indicate that the new dataset better identifies models aligned with a specific cultural context.
Current cultural bias evaluations of LLMs rely on datasets that lack the nuance to distinguish genuine cultural understanding from superficial mimicry; the dataset introduced here is built to tell models specialized for a given culture apart from those that are not.
Although the cultural (mis)alignment of Large Language Models (LLMs) has attracted increasing attention -- often framed in terms of cultural bias -- until recently there has been limited work on the design and development of datasets for cultural assessment. Here, we review existing approaches to such datasets and identify their main limitations. To address these issues, we propose design guidelines for annotators and report on the construction of a dataset built according to these principles. We further present a series of contrastive experiments conducted with this dataset. The results demonstrate that our design yields test sets with greater discriminative power, effectively distinguishing between models specialized for a given culture and those that are not, ceteris paribus.