This paper introduces a mathematical framework for evaluating trust in sources, including AI agents, based on the concept of "conviction," defined as the likelihood a source's stance will be validated by independent consensus. It formalizes sources as having generative and discriminative roles and defines reputation as the expected weighted signed conviction over a realm of claims. The authors argue that conviction, rather than correctness, is the principled basis for trust, especially for AI agents, and that continuous verification is crucial for reputation accrual.
Forget "trustworthiness" – the key to AI trust is verifiable "conviction," or the likelihood a model's claims will be independently validated.
The questions of \emph{knowledge}, \emph{truth}, and \emph{trust} are explored via a mathematical formulation of claims and sources. We define truth as the reproducibly perceived subset of knowledge, formalize sources as having both generative and discriminative roles, and develop a framework for reputation grounded in \emph{conviction} -- the likelihood that a source's stance is vindicated by independent consensus. We argue that conviction, rather than correctness or faithfulness, is the principled basis for trust: it is regime-independent, rewards genuine contribution, and demands the transparent and self-sufficient perceptions that make external verification possible. We formalize reputation as the expected weighted signed conviction over a realm of claims, characterize its behavior across source-claim regimes, and identify continuous verification as both a theoretical necessity and a practical mechanism through which reputation accrues. The framework is applied to AI agents, which are identified as capable but error-prone sources for whom verifiable conviction and continuously accrued reputation constitute the only robust foundation for trust.
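To make "expected weighted signed conviction over a realm of claims" concrete, the following is a minimal sketch of one possible reading, not the paper's actual definition. It assumes that conviction is a probability in [0, 1], that a vindicated stance counts +1 and a contradicted one counts -1 (so the expected signed value is 2p - 1), and that the expectation is a weight-normalized average over a finite set of claims. The names `ClaimRecord`, `signed_conviction`, and `reputation` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimRecord:
    """One claim in a realm, seen from a single source (hypothetical structure)."""
    weight: float      # assumed non-negative importance of the claim
    conviction: float  # assumed probability in [0, 1] that the stance is vindicated


def signed_conviction(conviction: float) -> float:
    """Map a vindication probability in [0, 1] to a signed score in [-1, 1].

    Assumption: vindication counts +1 and contradiction counts -1,
    so the expected signed outcome is 2p - 1.
    """
    if not 0.0 <= conviction <= 1.0:
        raise ValueError("conviction must lie in [0, 1]")
    return 2.0 * conviction - 1.0


def reputation(records: list[ClaimRecord]) -> float:
    """Weight-normalized average of signed conviction over a realm of claims.

    Assumption: a source with no weighted claims has neutral reputation 0.0.
    """
    total_weight = sum(r.weight for r in records)
    if total_weight <= 0:
        return 0.0
    weighted = sum(r.weight * signed_conviction(r.conviction) for r in records)
    return weighted / total_weight


if __name__ == "__main__":
    realm = [ClaimRecord(weight=1.0, conviction=1.0),
             ClaimRecord(weight=1.0, conviction=0.0)]
    # One fully vindicated and one fully contradicted claim cancel out.
    print(reputation(realm))
```

Under these assumptions, reputation lies in [-1, 1], and a source whose stances are vindicated no more often than chance sits near zero. Each newly verified claim adds a record and shifts the average, which is one simple way to picture reputation accruing through continuous verification.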