Mar 30, 2026
arXiv:2603.29013

Wherefore Art Thou? Provenance-Guided Automatic Online Debugging with Lumos

Jingyuan Chen, Lei Zhang, Leon Schuermann, Gongqi Huang, Ravi Netravali, Amit Levy

AI Summary

Lumos is introduced as an online debugging framework that automatically identifies the computational history (provenance) linking bug symptoms to root causes in distributed systems. It uses static analysis to guide instrumentation, focusing on program state relevant to bug provenance, and records this information on-demand with low overhead. Evaluation shows Lumos effectively provides developers with sufficient evidence to identify root causes from a few bug occurrences, addressing the challenge of debugging complex distributed systems in production.

Key Contribution

Pinpointing root causes in distributed systems just got easier: Lumos automatically exposes the computational history of bugs with low overhead, even with limited bug occurrences.

Abstract

Debugging distributed systems in production is inevitable and hard. Myriad interactions between concurrent components in modern, complex, large-scale systems cause non-deterministic bugs that offline testing and verification fail to capture. When bugs surface at runtime, their root causes may be far removed from their symptoms. To identify a root cause, developers often need evidence scattered across multiple components and traces. Unfortunately, existing tools fail to quickly and automatically record useful provenance information at low overhead, leaving developers to collect evidence manually, an onerous task. Lumos is an online debugging framework that exposes application-level bug provenance: the computational history linking the symptoms of an incident to their root causes. Lumos uses dependency-guided instrumentation, powered by static analysis, to identify program state related to a bug's provenance, and exposes that state via lightweight on-demand recording. Lumos gives developers enough evidence to identify a bug's root cause from only a few occurrences of the bug, while incurring low runtime overhead.
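The idea of using dependencies to decide which program state to record can be pictured with a small sketch. This is only an illustration under stated assumptions, not Lumos's implementation: the dependency graph, the variable names, and the `backward_slice` helper are all hypothetical, and a real static analysis would derive the graph from the program rather than take it as a hand-written dictionary.

```python
from collections import deque


def backward_slice(deps, symptom):
    """Return the symptom plus every variable it transitively depends on.

    `deps` maps a variable name to the variables it is computed from.
    (Hypothetical representation, chosen only for illustration.)
    """
    seen = {symptom}
    queue = deque([symptom])
    while queue:
        var = queue.popleft()
        for parent in deps.get(var, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical dependency graph for a small replicated service.
deps = {
    "response_status": ["commit_index", "leader_id"],
    "commit_index": ["match_index"],
    "leader_id": ["term"],
    "cache_size": ["config"],  # unrelated to the symptom
}

# Only state reachable backwards from the symptom would be recorded.
to_record = backward_slice(deps, "response_status")
```

In this toy graph, `cache_size` and `config` never feed into `response_status`, so they fall outside the slice and would not need to be instrumented, which is the intuition behind keeping recording overhead low.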

Code Generation & Program Synthesis; Distributed Systems & Hardware

Citation Metrics

Citations: 0

Influential citations: 0

References: 40

Year: 2026

Venue: N/A
