May 6, 2026 · arXiv:2605.04507

Distilling Bayesian Belief States into Language Models for Auditable Negotiation

AI Summary

This paper introduces BOND, a framework for training negotiation agents that explicitly model and report beliefs about their opponent's preferences. A Bayesian teacher scores dialogue contexts against the six possible opponent priority orderings and updates a posterior over them; that posterior is then distilled into a smaller 8B student LM that emits both actions and normalized posterior beliefs. On the CaSiNo dataset, BOND outperforms the state of the art and achieves a mean Brier score of 0.085 on opponent-priority posteriors, while the distilled student reaches 0.114, below the uniform reference of 5/36 (about 0.139).
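The posterior update described above can be sketched as a single Bayes step in log space. This is a minimal illustration, not the paper's implementation: the function names, the toy log-scores, and the assumption that the teacher's scores act as log-likelihoods are all hypothetical. The only grounded facts are that CaSiNo negotiations involve three item types, which gives 3! = 6 candidate priority orderings.

```python
import math
from itertools import permutations

# The three CaSiNo item types; their permutations give the six orderings.
ITEMS = ("food", "water", "firewood")
ORDERINGS = list(permutations(ITEMS))  # 3! = 6 candidate priority orderings


def uniform_prior():
    """Start with equal belief in every ordering."""
    return [1.0 / len(ORDERINGS)] * len(ORDERINGS)


def update_posterior(prior, log_scores):
    """One Bayes step: posterior is proportional to prior * exp(log_score).

    Assumes a strictly positive prior and that log_scores behave like
    log-likelihoods (an assumption; the paper's scoring rule may differ).
    """
    log_unnorm = [math.log(p) + s for p, s in zip(prior, log_scores)]
    peak = max(log_unnorm)  # subtract the max for numerical stability
    weights = [math.exp(v - peak) for v in log_unnorm]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical scores for one turn: evidence that the opponent ranks food
# first and, more weakly, water second.
toy_log_scores = [
    (1.0 if o[0] == "food" else 0.0) + (0.5 if o[1] == "water" else 0.0)
    for o in ORDERINGS
]

posterior = update_posterior(uniform_prior(), toy_log_scores)
best = ORDERINGS[posterior.index(max(posterior))]
```

Feeding each turn's posterior back in as the next turn's prior gives the kind of posterior trajectory the paper uses for auditing.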

Key Contribution

Interpretable Bayesian reasoning about opponent preferences can be distilled into an 8B language model that reports better-calibrated opponent beliefs than a 70B structured-CoT baseline, making negotiation strategies auditable in detail.

Abstract

Negotiation agents must infer what their counterpart values, update those beliefs over dialogue turns, and choose actions under uncertainty. End-to-end large language models (LLMs) can imitate negotiation dialogue, but their opponent beliefs are usually implicit and difficult to inspect. We propose BOND (Bayesian Opponent-belief Negotiation Distillation), a framework for auditable negotiation. BOND consists of an LLM-based Bayesian teacher that scores dialogue contexts against the six possible opponent priority orderings, updates a posterior over those orderings, and uses the posterior for menu-based decision making, as well as a smaller 8B student language model that emits both negotiation actions and normalized posterior beliefs as tagged text. In the CaSiNo negotiation dataset, BOND outperforms the state-of-the-art and achieves mean Brier score 0.085 over opponent-priority posteriors. The distilled student preserves much of this belief signal, achieving Brier 0.114, below the uniform six-ordering reference of 5/36, approximately 0.139. Compared with a 70B structured-CoT baseline, the significantly smaller 8B student model yields substantially better elicited posterior calibration. We further showcase auditability through posterior trajectories, belief-versus-policy error decomposition, and posterior-prefix interventions. These diagnostics reveal that distillation preserves a scoreable belief report more strongly than causal belief-conditioned control, making weak belief-action coupling visible, not hidden.
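The uniform reference of 5/36 quoted in the abstract can be checked with a short calculation. The sketch below assumes the Brier score is the squared error against a one-hot target averaged over the six orderings; the abstract does not spell out the normalization, but this convention is the one that reproduces 5/36. The function name is hypothetical.

```python
def brier_score(posterior, true_index):
    """Squared error against the one-hot truth, averaged over classes.

    The per-class averaging is an assumption chosen because it matches
    the 5/36 uniform reference stated in the abstract.
    """
    k = len(posterior)
    squared_errors = [
        (p - (1.0 if i == true_index else 0.0)) ** 2
        for i, p in enumerate(posterior)
    ]
    return sum(squared_errors) / k


# Uniform belief over six orderings: one class misses by 5/6, five by 1/6.
# ((5/6)^2 + 5 * (1/6)^2) / 6 = (30/36) / 6 = 5/36
uniform = [1.0 / 6] * 6
uniform_reference = brier_score(uniform, true_index=0)
```

Under this convention a posterior that puts all mass on the correct ordering scores 0, so the reported 0.085 and 0.114 both sit between a perfect report and the uninformative uniform baseline.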

Inference & Quantization, Natural Language Processing, Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 11

Year: 2026

Venue: N/A
