OrgForge is introduced as a multi-agent simulation framework designed to generate verifiable synthetic corporate corpora for evaluating retrieval-augmented generation (RAG) pipelines. The framework enforces a strict separation between a deterministic Python engine maintaining ground truth events and LLMs generating surface prose, ensuring consistency and traceability across documents. OrgForge simulates various corporate communication channels like Slack, JIRA, and email, all linked to a shared event log, and includes mechanisms for tracking causal chains and detecting recurring failures.
Finally, a way to generate synthetic corporate datasets with guaranteed ground truth for RAG evaluation, sidestepping the legal and consistency issues of real-world and purely LLM-generated data.
Evaluating retrieval-augmented generation (RAG) pipelines requires corpora where ground truth is knowable, temporally structured, and cross-artifact: properties that real-world datasets rarely provide cleanly. Existing resources such as the Enron corpus carry legal ambiguity, demographic skew, and no structured ground truth. Purely LLM-generated synthetic data solves the legal problem but introduces a subtler one: the generating model cannot be prevented from hallucinating facts that contradict one another across documents. We present OrgForge, an open-source multi-agent simulation framework that enforces a strict physics-cognition boundary: a deterministic Python engine maintains a SimEvent ground-truth bus; large language models generate only surface prose, constrained by validated proposals. An actor-local clock enforces causal timestamp correctness across all artifact types, eliminating the class of timeline inconsistencies that arise when timestamps are sampled independently per document. We formalize three graph-dynamic subsystems (stress propagation via betweenness centrality, temporal edge-weight decay, and Dijkstra escalation routing) that govern organizational behavior independently of any LLM. Running a configurable N-day simulation, OrgForge produces interleaved Slack threads, JIRA tickets, Confluence pages, Git pull requests, and emails, all traceable to a shared, immutable event log. We additionally describe a causal chain tracking subsystem that accumulates cross-artifact evidence graphs per incident, a hybrid reciprocal-rank-fusion recurrence detector for identifying repeated failure classes, and an inbound/outbound email engine that routes vendor alerts, customer complaints, and HR correspondence through gated causal chains with probabilistic drop simulation. OrgForge is available under the MIT license.
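As a rough illustration of the reciprocal-rank-fusion idea named in the abstract, the sketch below fuses two ranked lists of past incidents into one ranking using the standard formula score(d) = Σ 1 / (k + rank(d)). The abstract does not say which rankers the detector combines or what constant it uses, so the function name, the lexical/semantic split, the incident IDs, and k = 60 are all assumptions made for illustration, not OrgForge's actual implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists into one list, best candidate first.

    Each item earns 1 / (k + rank) from every list it appears in
    (rank is 1-based). k = 60 is a commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by item ID for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical rankings of past incidents by similarity to a new incident.
lexical = ["INC-12", "INC-07", "INC-03"]   # e.g. a keyword-based ranker
semantic = ["INC-07", "INC-21", "INC-12"]  # e.g. an embedding-based ranker

fused = reciprocal_rank_fusion([lexical, semantic])
print(fused)
```

An incident that ranks well in both lists rises to the top of the fused ranking, which is the property a recurrence detector would rely on when deciding whether a new failure repeats an earlier class.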