This paper evaluates the strategic decision-making capabilities of six LLMs in geopolitical crisis simulations, comparing their behavior to that of human participants across multiple rounds of interaction. The study analyzes action alignment with human choices, risk calibration based on the severity of selected actions, and the framing of arguments used to justify decisions. The key finding is that while LLMs approximate human decision patterns in the base rounds, they diverge over time and consistently exhibit a normative-cooperative framing focused on stability, coordination, and risk mitigation, with limited adversarial reasoning.
LLMs in geopolitical simulations initially approximate human decision-making but diverge over successive rounds into distinct, strongly cooperative strategies with limited adversarial reasoning.
Large language models (LLMs) are increasingly proposed as agents in strategic decision environments, yet their behavior in structured geopolitical simulations remains under-researched. We evaluate six popular state-of-the-art LLMs alongside results from human participants across four real-world crisis simulation scenarios, requiring models to select predefined actions and justify their decisions across multiple rounds. We compare models to humans in action alignment, risk calibration through the severity of chosen actions, and argumentative framing grounded in international relations theory. Results show that models approximate human decision patterns in base simulation rounds but diverge over time, displaying distinct behavioural profiles and strategy updates. LLM explanations for chosen actions across all models exhibit a strong normative-cooperative framing centered on stability, coordination, and risk mitigation, with limited adversarial reasoning.