LLMs struggle to balance task completion with cultural norms in dynamic social simulations, revealing critical gaps in their cross-cultural robustness and highlighting the need for human oversight in automated benchmarking.