This paper analyzes the software supply chain of AI systems, highlighting vulnerabilities across data acquisition, model training, inference, and the underlying substrate. It identifies four key structural gaps (verifiability, versioning, observability, and traceability) that current AI systems fail to adequately address. The authors quantify the scale of this supply chain by analyzing a reference stack of 48 open-source projects, revealing a massive dependency graph of over 11,000 transitive packages and nearly 400 million lines of code.
AI systems are built on a software house of cards, with roughly 400M lines of code and over 11,000 transitive dependencies, yet lack basic supply chain protections like versioning and verifiability.
AI systems rest on software with weak integrity mechanisms, leaving them exposed at every stage from data acquisition to final inference. This paper makes the AI supply chain a first-class object of analysis, decomposing it across four architectural layers: data acquisition, model training, model inference, and a cross-cutting substrate. Within these layers, we identify four structural gaps that traditional supply chain mechanisms do not address: verifiability, versioning, observability, and traceability. Current AI systems fall short on all of them: they carry undeclared behavioral couplings that no resolver enforces; they cannot be reverted to known working assemblies; they degrade silently rather than surfacing breaking changes; and their lineage can hardly be approximated. To illustrate the scale of the software supply chain of AI, we measure a reference stack of 48 production-grade open-source projects, which declares 4,664 direct dependencies, resolves to 11,508 transitive packages, and totals roughly 392M lines of code.
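The gap between declared (direct) and resolved (transitive) dependencies can be illustrated with a small sketch. The graph below is a hypothetical toy example, not data from the paper, and `transitive_closure` is an illustrative helper rather than the authors' measurement method; it only shows how a handful of direct declarations can resolve to a larger set of packages.

```python
from collections import deque

def transitive_closure(direct, roots):
    """Return every package reachable from `roots` by following `direct` edges."""
    seen = set()
    queue = deque(roots)
    while queue:
        pkg = queue.popleft()
        for dep in direct.get(pkg, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen

# Hypothetical toy graph: package -> directly declared dependencies.
# All names are made up for illustration.
toy_graph = {
    "app": ["http-client", "tensor-lib"],
    "http-client": ["tls", "url-parser"],
    "tensor-lib": ["blas-bindings", "url-parser"],
    "tls": ["crypto-core"],
}

direct_count = len(toy_graph["app"])              # what "app" declares
transitive = transitive_closure(toy_graph, ["app"])  # what "app" actually pulls in

print(direct_count, len(transitive))
```

In this toy case two declared dependencies resolve to six packages; shared packages such as `url-parser` are counted once, which is why a transitive count is not simply the sum of each project's direct declarations.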