This paper provides a theoretical analysis of the Elastic-Sketch data structure under stationary random streams, deriving closed-form expressions for the limiting distribution of counters and expected counting error as stream length approaches infinity. These expressions enable efficient, grid-based tuning of the heavy and Count-Min blocks' memory split and the eviction threshold. The authors also characterize the structure of the optimal eviction threshold, reducing the search space and revealing its dependence on the arrival distribution, validated through simulations on Zipf-distributed data.
Elastic-Sketch's performance depends on the stream characteristics and the eviction threshold; this work derives closed-form expressions for its limiting behavior under stationary random streams, which make near-optimal configuration tractable.
\texttt{Elastic-Sketch} is a hash-based data structure for counting item appearances in a data stream, and it has been empirically shown to achieve a better memory-accuracy trade-off than classical methods. This algorithm combines a \textit{heavy block}, which aims to maintain exact counts for a small set of dynamically \textit{elected} items, with a light block that implements \texttt{Count-Min} \texttt{Sketch} (\texttt{CM}) for summarizing the remaining traffic. The heavy block dynamics are governed by a hash function~$\beta$ that hashes items into~$m_1$ buckets, and an \textit{eviction threshold}~$\lambda$, which controls how easily an elected item can be replaced. We show that the performance of \texttt{Elastic-Sketch} strongly depends on the stream characteristics and the choice of~$\lambda$. Since optimal parameter choices depend on unknown stream properties, we analyze \texttt{Elastic-Sketch} under a \textit{stationary random stream} model -- a common assumption that captures the statistical regularities observed in real workloads. Formally, as the stream length goes to infinity, we derive closed-form expressions for the limiting distribution of the counters and the resulting expected counting error. These expressions are efficiently computable, enabling practical grid-based tuning of the memory split between the heavy and \texttt{CM} blocks (via $m_1$) and of the eviction threshold~$\lambda$. We further characterize the structure of the optimal eviction threshold, substantially reducing the search space and showing how this threshold depends on the arrival distribution. Extensive numerical simulations validate our asymptotic results on finite streams from the Zipf distribution.
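To make the roles of $m_1$ and $\lambda$ concrete, here is a minimal Python sketch of an Elastic-Sketch-style structure. The abstract specifies only the heavy block with $m_1$ buckets, the eviction threshold $\lambda$, and the Count-Min light block. The rest is an assumption based on the commonly described design and may differ from the variant analyzed in the paper: the per-bucket `votes_for`/`votes_against` counters, the `flag` marking a bucket whose item may also have counts in the light block, the rule "evict when `votes_against >= lam * votes_for`", and the hashing scheme. All class, method, and parameter names are hypothetical.

```python
import hashlib


def _hash(item, seed, modulus):
    """Deterministic hash of `item` into range(modulus) (illustrative choice)."""
    data = f"{seed}:{item!r}".encode()
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "big") % modulus


class ElasticSketch:
    """Simplified Elastic-Sketch: one heavy bucket per slot plus a Count-Min block."""

    def __init__(self, m1, cm_width, cm_depth, lam):
        self.m1 = m1              # number of heavy buckets
        self.lam = lam            # eviction threshold (assumed ratio rule)
        self.cm_width = cm_width  # counters per Count-Min row
        self.cm_depth = cm_depth  # number of Count-Min rows
        # Each heavy bucket is [key, votes_for, votes_against, flag].
        self.heavy = [[None, 0, 0, False] for _ in range(m1)]
        self.cm = [[0] * cm_width for _ in range(cm_depth)]

    def _cm_add(self, item, count):
        for row in range(self.cm_depth):
            self.cm[row][_hash(item, row + 1, self.cm_width)] += count

    def _cm_query(self, item):
        return min(
            self.cm[row][_hash(item, row + 1, self.cm_width)]
            for row in range(self.cm_depth)
        )

    def update(self, item):
        bucket = self.heavy[_hash(item, 0, self.m1)]
        if bucket[0] is None:
            # Empty bucket: elect the item with an exact count of 1.
            bucket[:] = [item, 1, 0, False]
        elif bucket[0] == item:
            bucket[1] += 1
        else:
            bucket[2] += 1
            if bucket[2] >= self.lam * bucket[1]:
                # Evict: move the elected item's count to the light block.
                self._cm_add(bucket[0], bucket[1])
                # flag=True: earlier arrivals of `item` may sit in the light block.
                bucket[:] = [item, 1, 1, True]
            else:
                # Keep the elected item; count the newcomer in the light block.
                self._cm_add(item, 1)

    def query(self, item):
        key, votes_for, _, flag = self.heavy[_hash(item, 0, self.m1)]
        if key == item:
            return votes_for + (self._cm_query(item) if flag else 0)
        return self._cm_query(item)
```

Under these assumptions, a larger `lam` makes elected items harder to replace, while `m1` trades heavy-block capacity against light-block memory. An item that has held its bucket since the bucket was empty is counted exactly; every other estimate can only overestimate, as with a plain Count-Min sketch.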