This paper introduces the Culture-Related Open Questions (CROQ) dataset to probe cultural biases in LLMs, revealing a surprising preference for countries such as Japan. Experiments show that prompting in English and other high-resource languages yields more diverse outputs and a weaker inclination towards countries where the prompt language is an official language. Further analysis indicates that the first clear signs of this cultural bias appear after supervised fine-tuning rather than during pre-training.
LLMs aren't just Western-centric; they have a peculiar obsession with Japan, though prompting in English and other high-resource languages yields more diverse answers.
LLMs have shown limitations in cultural coverage and competence, and in some cases exhibit regional biases such as amplifying Western and Anglocentric viewpoints. While prior work has analysed the cultural capabilities of LLMs, none has specifically examined LLM regional preferences on culture-related questions. In this work, we propose a new dataset based on a comprehensive taxonomy of Culture-Related Open Questions (CROQ). The results show that, contrary to previous cultural bias work, LLMs show a clear tendency towards countries such as Japan. Moreover, our results show that when prompted in English or other high-resource languages, LLMs tend to provide more diverse outputs and show less inclination towards answers highlighting countries for which the input language is an official language. Finally, we investigate at which point of LLM training this cultural bias emerges, with our results suggesting that the first clear signs appear after supervised fine-tuning, and not during pre-training.