This paper introduces an analytical framework for quantifying computational reliability in Extreme Edge Computing (XEC) environments, where streaming workloads are distributed across volatile consumer devices. The framework derives closed-form expressions for reliability, defined as the probability that instantaneous capacity meets demand, under both minimal information and historical data regimes, and extends to multi-device deployments with series, parallel, and partitioned workload configurations. Validation using real-time object detection with YOLO11m in emulated XEC environments demonstrates strong agreement between analytical predictions, Monte Carlo simulations, and empirical measurements.
Forget simulations – this analytical framework lets you predict the reliability of distributed AI inference on the chaotic extreme edge using just a few equations.
Extreme Edge Computing (XEC) distributes streaming workloads across consumer-owned devices, exploiting their proximity to users and ubiquitous availability. Many such workloads are AI-driven, requiring continuous neural network inference for tasks like object detection and video analytics. Distributed Inference (DI), which partitions model execution across multiple edge devices, enables these streaming services to meet strict throughput and latency requirements. Yet consumer devices exhibit volatile computational availability due to competing applications and unpredictable usage patterns. This volatility poses a fundamental challenge: how can we quantify the probability that a device, or ensemble of devices, will maintain the processing rate required by a streaming service? This paper presents an analytical framework for computational reliability in XEC, defined as the probability that instantaneous capacity meets demand at a specified Quality of Service (QoS) threshold. We derive closed-form reliability expressions under two information regimes: Minimal Information (MI), requiring only declared operational bounds, and historical data, which refines estimates via Maximum Likelihood Estimation from past observations. The framework extends to multi-device deployments, providing reliability expressions for series, parallel, and partitioned workload configurations. We derive optimal workload allocation rules and analytical bounds for device selection, equipping orchestrators with tractable tools to evaluate deployment feasibility and configure distributed streaming systems. We validate the framework using real-time object detection with the YOLO11m model as a representative DI streaming workload; experiments on emulated XEC environments demonstrate close agreement between analytical predictions, Monte Carlo sampling, and empirical measurements across diverse capacity and demand configurations.
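To make the reliability definition concrete, the following is a minimal Python sketch and not the paper's actual derivation. It assumes that, with only declared bounds available, a device's capacity can be modeled as uniformly distributed between those bounds, that demand is a fixed rate, and that devices are statistically independent. The function names, the uniform model, and the reading of "series" (every device must meet demand) and "parallel" (at least one device must meet demand) are illustrative assumptions; the abstract does not state the paper's closed-form expressions.

```python
import random


def uniform_reliability(c_min, c_max, demand):
    """P(capacity >= demand) when capacity ~ Uniform[c_min, c_max].

    Assumed model: the uniform distribution is an illustrative choice for
    the bounds-only case, not necessarily the paper's expression.
    """
    if c_max <= c_min:
        raise ValueError("c_max must exceed c_min")
    if demand <= c_min:
        return 1.0
    if demand >= c_max:
        return 0.0
    return (c_max - demand) / (c_max - c_min)


def series_reliability(reliabilities):
    """All devices must meet demand (assumes independent devices)."""
    result = 1.0
    for r in reliabilities:
        result *= r
    return result


def parallel_reliability(reliabilities):
    """At least one device must meet demand (assumes independent devices)."""
    all_fail = 1.0
    for r in reliabilities:
        all_fail *= 1.0 - r
    return 1.0 - all_fail


def monte_carlo_reliability(c_min, c_max, demand, trials=100_000, seed=0):
    """Estimate the same probability by sampling capacity values."""
    rng = random.Random(seed)
    hits = sum(rng.uniform(c_min, c_max) >= demand for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    # Hypothetical device: 10 to 30 frames/s declared, 25 frames/s required.
    analytic = uniform_reliability(10.0, 30.0, 25.0)
    sampled = monte_carlo_reliability(10.0, 30.0, 25.0)
    print(f"analytic={analytic:.3f} monte_carlo={sampled:.3f}")
    print(f"series={series_reliability([0.9, 0.8]):.3f}")
    print(f"parallel={parallel_reliability([0.9, 0.8]):.3f}")
```

Under these assumptions, the analytic value and the sampled estimate should agree to within sampling error, which mirrors in miniature the comparison between analytical predictions and Monte Carlo sampling that the abstract reports.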