This paper validates and extends a cultural alignment framework for LLMs, demonstrating that cultural skew and the benefits of culture-specific prompting persist in open-weight models. The authors introduce prompt programming with DSPy to systematically tune cultural conditioning by optimizing against cultural-distance objectives derived from social science surveys. Experiments show that prompt optimization with DSPy often surpasses manual cultural prompt engineering, offering a more stable and transferable approach to culturally aligned LLM responses.
Optimizing prompts with DSPy often improves cultural alignment in LLMs beyond what manual prompt engineering achieves, and it offers a more stable and transferable way to mitigate cultural bias.
Culture shapes reasoning, values, prioritization, and strategic decision-making, yet large language models (LLMs) often exhibit cultural biases that misalign with target populations. As LLMs are increasingly used for strategic decision-making, policy support, and document engineering tasks such as summarization, categorization, and compliance-oriented auditing, improving cultural alignment is important for ensuring that downstream analyses and recommendations reflect target-population value profiles rather than default model priors. Previous work introduced a survey-grounded cultural alignment framework and showed that culture-specific prompting can reduce misalignment, but it primarily evaluated proprietary models and relied on manual prompt engineering. In this paper, we validate and extend that framework by reproducing its social-science survey-based projection and distance metrics on open-weight LLMs, testing whether the same cultural skew and benefits of culture conditioning persist outside closed LLM systems. Building on this foundation, we apply prompt programming with DSPy to this problem (treating prompts as modular, optimizable programs) to systematically tune cultural conditioning by optimizing against cultural-distance objectives. In our experiments, we show that prompt optimization often improves upon cultural prompt engineering, suggesting that prompt compilation with DSPy can provide a more stable and transferable route to culturally aligned LLM responses.
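As a rough illustration of what a cultural-distance objective could look like, the Python sketch below assumes that survey responses (from the model and from the target population) have already been projected to points in a shared low-dimensional value space. The Euclidean distance, the `1 / (1 + d)` scoring, and every function name here are assumptions made for illustration; they are not the paper's definitions.

```python
import math
from typing import Dict, Sequence


def cultural_distance(model_point: Sequence[float],
                      target_point: Sequence[float]) -> float:
    """Euclidean distance between two projected value profiles.

    Assumption: both points live in the same survey-derived space.
    """
    if len(model_point) != len(target_point):
        raise ValueError("points must have the same number of dimensions")
    return math.dist(model_point, target_point)


def alignment_score(model_point: Sequence[float],
                    target_point: Sequence[float]) -> float:
    """Map distance to a higher-is-better score in (0, 1].

    Hypothetical choice: a prompt optimizer that maximizes a metric
    needs distance turned into something that grows as distance shrinks.
    """
    return 1.0 / (1.0 + cultural_distance(model_point, target_point))


def pick_best_prompt(candidates: Dict[str, Sequence[float]],
                     target_point: Sequence[float]) -> str:
    """Return the candidate prompt whose projected responses score highest."""
    return max(candidates,
               key=lambda name: alignment_score(candidates[name], target_point))


# Made-up coordinates, purely to show the call pattern.
candidates = {
    "generic": (1.0, 1.0),
    "culture_specific": (0.2, 0.1),
}
best = pick_best_prompt(candidates, target_point=(0.0, 0.0))
```

In this toy setup the prompt whose projected point lies nearest the target profile wins. An automated prompt optimizer would search over many more candidates against a score of this kind.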