This paper investigates whether LLMs can emulate the cultural values of specific demographic subgroups, using Singapore as a case study and the World Values Survey (WVS) as ground truth. The authors find that even GPT-4.1 achieves only 57.4% accuracy in predicting subgroup modal preferences, and that fine-tuning on structured numerical preferences improves accuracy on unseen, out-of-distribution subgroups by an average of 17.4%. However, models show pre-existing performance biases across subgroups, and fine-tuning widens the disparity between subgroups when measured by distance-aware metrics.
Fine-tuning LLMs to emulate specific cultural values improves average accuracy, but simultaneously widens the disparity in performance between demographic subgroups.
Despite their global prevalence, many Large Language Models (LLMs) are aligned to a monolithic, often Western-centric set of values. This paper investigates the more challenging task of fine-grained value alignment: examining whether LLMs can emulate the distinct cultural values of demographic subgroups. Using Singapore as a case study and the World Values Survey (WVS), we examine the value landscape and show that even state-of-the-art models like GPT-4.1 achieve only 57.4% accuracy in predicting subgroup modal preferences. We construct a dataset of over 20,000 samples to train and evaluate a range of models. We demonstrate that simple fine-tuning on structured numerical preferences yields substantial gains, improving accuracy on unseen, out-of-distribution subgroups by an average of 17.4%. These gains partially transfer to open-ended generation. However, we find significant pre-existing performance biases, where models better emulate young, male, Chinese, and Christian personas. Furthermore, while fine-tuning improves average performance, it widens the disparity between subgroups when measured by distance-aware metrics. Our work offers insights into the limits and fairness implications of subgroup-level cultural alignment.
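The abstract contrasts exact-match accuracy on subgroup modal preferences with distance-aware metrics. The sketch below illustrates that distinction on toy data, assuming ordinal survey answers coded as integers. The function names, the tie-breaking rule, and the normalized absolute distance are illustrative assumptions, not the paper's definitions.

```python
from collections import Counter


def modal_preference(responses):
    """Return the most common answer in a subgroup.

    Ties are broken by taking the smallest answer value (an assumption).
    """
    counts = Counter(responses)
    top = max(counts.values())
    return min(value for value, count in counts.items() if count == top)


def exact_accuracy(predicted, modal):
    """Fraction of questions where the prediction equals the modal answer."""
    hits = sum(p == m for p, m in zip(predicted, modal))
    return hits / len(modal)


def normalized_distance(predicted, modal, scale_sizes):
    """Mean absolute error, with each question scaled by its answer range.

    This is one possible distance-aware metric (hypothetical choice):
    0.0 means every prediction is exact, 1.0 means every prediction
    sits at the opposite end of the scale.
    """
    total = sum(
        abs(p - m) / (k - 1) for p, m, k in zip(predicted, modal, scale_sizes)
    )
    return total / len(modal)


# Hypothetical toy data: one subgroup, three questions, each on a 1-4 scale.
survey = [
    [1, 1, 2, 4],
    [3, 3, 3, 2],
    [2, 4, 4, 4],
]
scale_sizes = [4, 4, 4]

modal = [modal_preference(answers) for answers in survey]
predicted = [1, 2, 1]  # stand-in for a model's answers

acc = exact_accuracy(predicted, modal)
dist = normalized_distance(predicted, modal, scale_sizes)
print(f"modal={modal} accuracy={acc:.3f} distance={dist:.3f}")
```

In this toy case the second and third predictions are both wrong under exact-match accuracy, but the distance-aware score penalizes the third (off by three scale points) more than the second (off by one). This is why the two kinds of metric can rank subgroup disparities differently.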