The authors introduce PhantomRun, a system and dataset for reproducing CI build failures in embedded open-source software, addressing the challenges of short-lived and heterogeneous CI build logs. PhantomRun achieves a 91.8% build reconstruction rate across 4628 failing CI runs while preserving execution outcomes in 98% of cases. By providing a unified abstraction layer for retrieving, storing, and re-executing CI builds, PhantomRun enables reproducible studies of CI failures.
Replaying CI failures in embedded systems is now possible at scale: PhantomRun reconstructs over 90% of failing builds, opening the door to systematic debugging and failure analysis.
Because embedded systems co-develop hardware and software, continuous integration (CI) builds frequently fail on complex cross-compilation, board configurations, and toolchain constraints. Although CI build logs contain valuable diagnostic information, they are short-lived and difficult to reuse due to heterogeneous runners, toolchains, and log formats. To address these challenges, we present PhantomRun, a unified abstraction layer and publicly reusable dataset that standardizes the retrieval, storage, and reproduction of CI build logs and metadata. Across 4628 failing CI runs, we reconstructed 91.8% of builds and preserved execution outcomes in 98% of evaluated cases. PhantomRun provides two core capabilities: retrieving the build log of any commit and faithfully re-executing the corresponding build in a controlled environment. By exposing all build artifacts and metadata in a uniform, machine-readable format, PhantomRun enables reproducible and longitudinal studies of CI failures. An empirical evaluation shows that reproduced builds closely match their originals, typically differing only in timestamps or minor nondeterministic reordering, which demonstrates the feasibility of large-scale historical CI reconstruction.
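The closing claim, that reproduced logs differ from their originals only in timestamps or line order, can be illustrated with a small comparison routine. The following is a minimal sketch and not PhantomRun's actual method. The function names, the timestamp patterns, and the sample log lines are all made up for illustration.

```python
import re
from collections import Counter

# Assumption: timestamps look like ISO 8601 datetimes or bare HH:MM:SS clock
# times. Real CI logs may use other formats that need extra patterns.
TIMESTAMP = re.compile(
    r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})?"
    r"|\b\d{2}:\d{2}:\d{2}(?:\.\d+)?\b"
)


def normalize(log: str) -> Counter:
    """Mask timestamps, drop blank lines, and return a multiset of lines."""
    lines = (TIMESTAMP.sub("<TS>", line).strip() for line in log.splitlines())
    return Counter(line for line in lines if line)


def logs_equivalent(original: str, reproduced: str) -> bool:
    """True if the logs match once timestamps and line order are ignored."""
    return normalize(original) == normalize(reproduced)


# Hypothetical log excerpts, invented for this example.
original = (
    "2024-01-02T10:00:00Z CC main.o\n"
    "2024-01-02T10:00:01Z CC uart.o\n"
    "2024-01-02T10:00:02Z error: undefined reference to board_init\n"
)
reproduced = (
    "2025-06-01T08:30:10Z CC uart.o\n"
    "2025-06-01T08:30:11Z CC main.o\n"
    "2025-06-01T08:30:12Z error: undefined reference to board_init\n"
)

print(logs_equivalent(original, reproduced))  # prints True
```

Comparing multisets ignores line order entirely, which is more permissive than "minor reordering". A stricter check could bound how far each line is allowed to move between the two logs.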