Using Forward-In-Time-Only (FITO) analysis, the paper argues that the Unix filesystem abstraction of atomic state transitions fails at every layer of the computing stack, from filesystem journaling down to CPU behavior. It proves that no syscall-based persistence primitive can define a commit boundary under failure, because the same syscall return value is consistent with multiple materially different persistence states. The analysis identifies cross-layer temporal assumption leakage as the mechanism that propagates this category mistake, so that the storage stack forms a recursive chain of non-atomic dependencies.
The Unix filesystem's promise of atomic operations is a mirage, leading to widespread data corruption and costly failures across major cloud providers and databases.
Unix tools such as ls, cp, mv, and rename expose a filesystem abstraction that appears to present a single, authoritative state evolving through atomic transitions. This abstraction is false. We present a systematic Forward-In-Time-Only (FITO) analysis demonstrating that the assumption of instantaneous atomic state transitions constitutes a category mistake at every layer of the computing stack, from ext4 journaling and delayed allocation, through fsync failure semantics, NVMe Flush/FUA device behavior, and Linux restartable sequences, down to the x86-64 CPU's own inability to guarantee atomic supervisor entry under Non-Maskable Interrupts. We prove a formal impossibility result: no syscall-based persistence primitive can define a commit boundary under failure, because the syscall return value is consistent with multiple materially different persistence states across Linux filesystems. We identify cross-layer temporal assumption leakage as the structural mechanism by which the category mistake propagates, and show that the entire storage stack forms a recursive chain of non-atomic dependencies whose apparent atomicity reflects mathematical impossibility (Herlihy, 1991), not merely engineering deficiency. An appendix documents the real-world consequences: cascading cloud outages at Google, AWS, Meta, and Cloudflare driven by retry amplification; database corruption from fsync failures in PostgreSQL, etcd, and MySQL; silent data corruption at CERN, NetApp, and Meta; AI training waste consuming 12-43% of compute budgets at scale; and financial system failures totaling billions of dollars annually. These consequences trace to a single structural cause: systems designed around the FITO assumption, compensating for its failure with retry-and-recover protocols that amplify the very failures they attempt to mask.
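To make the commit-boundary claim concrete, the following is a minimal sketch of the conventional "write to a temporary file, fsync, rename, fsync the directory" recipe that applications use to approximate an atomic update. It is an illustration, not code from the paper: the function name `durable_replace` is hypothetical, and the sketch assumes a POSIX system such as Linux where a directory can be opened and passed to `fsync`. The comments mark the points at which the calling process learns only that a syscall returned without error, which is the information the abstract argues is insufficient to identify a single persistence state.

```python
import os
import tempfile


def durable_replace(path: str, data: bytes) -> None:
    """Hypothetical helper: replace `path` with `data` via temp file + rename.

    Every step reports success or failure only through a syscall return.
    None of those returns tells the caller which on-media state would
    survive a crash at that instant.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Step 1: write the new contents to a temporary file in the same directory,
    # so the later rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            # Step 2: fsync the file. A normal return means the kernel reported
            # no error; what reached stable media still depends on the
            # filesystem and on how the device handles flush requests.
            os.fsync(f.fileno())

        # Step 3: rename over the target. The name switch is atomic with
        # respect to other processes' view of the namespace, which is a
        # different property from being durable after power loss.
        os.replace(tmp_path, path)
    except BaseException:
        # Best-effort cleanup of the temporary file on any failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # Step 4: fsync the containing directory so the new directory entry is
    # also submitted for persistence (assumes a POSIX system such as Linux).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A call such as `durable_replace("state.json", b"{}")` returns `None` on success. Under the abstract's argument, that successful return is compatible with several different post-crash states, so the caller cannot treat it as a commit point.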