The paper introduces a certification protocol for multi-agent communication based on the stimulus-meaning model, aiming to verify that agents share an understanding of terms and to mitigate semantic drift. Agents are tested on shared observable events, and a term is certified if their empirical disagreement on it falls below a statistical threshold. Restricting reasoning to certified terms ("core-guarded reasoning") gives provably bounded disagreement. In experiments, core-guarding reduces disagreement by 72-96% in simulations and by 51% in a validation with fine-tuned language models.
A first step towards verifiable agent-to-agent communication: a certification protocol that checks whether agents share an understanding of terms reduces disagreement by up to 96% in simulation and by 51% with fine-tuned language models.
Multiagent AI systems require consistent communication, but we lack methods to verify that agents share the same understanding of the terms used. Natural language is interpretable but vulnerable to semantic drift, while learned protocols are efficient but opaque. We propose a certification protocol based on the stimulus-meaning model, where agents are tested on shared observable events and terms are certified if empirical disagreement falls below a statistical threshold. In this protocol, agents restricting their reasoning to certified terms ("core-guarded reasoning") achieve provably bounded disagreement. We also outline mechanisms for detecting drift (recertification) and recovering shared vocabulary (renegotiation). In simulations with varying degrees of semantic divergence, core-guarding reduces disagreement by 72-96%. In a validation with fine-tuned language models, disagreement is reduced by 51%. Our framework provides a first step towards verifiable agent-to-agent communication.
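The certification step described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which statistical test is used, so the sketch assumes a one-sided Hoeffding upper confidence bound on the disagreement rate. The names `certify_terms`, `core_guarded`, `epsilon` (tolerated disagreement), and `delta` (failure probability), as well as the two toy agents, are hypothetical.

```python
import math
from typing import Callable, Hashable, Iterable, Set

# Assumption: an agent is modeled as a function (term, event) -> bool,
# i.e. whether it assents to applying the term to an observed event.
Agent = Callable[[str, Hashable], bool]


def disagreement_upper_bound(mismatches: int, n: int, delta: float) -> float:
    """One-sided Hoeffding upper bound on the true disagreement rate.

    Assumption: the paper's actual statistical threshold may differ.
    """
    if n == 0:
        return 1.0  # no evidence, so assume the worst case
    slack = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return min(1.0, mismatches / n + slack)


def certify_terms(
    agent_a: Agent,
    agent_b: Agent,
    terms: Iterable[str],
    events: Iterable[Hashable],
    epsilon: float = 0.1,
    delta: float = 0.05,
) -> Set[str]:
    """Return the terms whose bounded disagreement rate is at most epsilon."""
    shared_events = list(events)
    core = set()
    for term in terms:
        # Count shared events on which the two agents give different verdicts.
        mismatches = sum(
            agent_a(term, e) != agent_b(term, e) for e in shared_events
        )
        bound = disagreement_upper_bound(mismatches, len(shared_events), delta)
        if bound <= epsilon:
            core.add(term)
    return core


def core_guarded(message_terms: Iterable[str], core: Set[str]) -> bool:
    """A message is core-guarded if it uses only certified terms."""
    return all(term in core for term in message_terms)


# Toy agents (hypothetical): they nearly agree on "large", but not on "hot".
def agent_a(term: str, event: int) -> bool:
    return event >= 500 if term == "large" else event % 2 == 0


def agent_b(term: str, event: int) -> bool:
    return event >= 510 if term == "large" else event % 3 == 0


core = certify_terms(agent_a, agent_b, ["large", "hot"], range(1000))
```

In this toy setup the agents differ on "large" for only 10 of 1000 events, so the term is certified, while "hot" is rejected because they disagree on half the events. A message using only "large" passes `core_guarded`; one that also uses "hot" does not.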