This paper compares n-gram models with LSTMs and Transformers for next-activity prediction in event logs, finding that n-grams achieve comparable accuracy with significantly fewer resources and more stable performance. To improve n-gram performance without incurring the overhead of traditional ensemble methods, the authors introduce a "promotion algorithm" that dynamically selects between two active models during inference. Experiments on real-world datasets demonstrate that these ensembles match or exceed the accuracy of non-windowed neural models at a lower computational cost.
N-gram models can rival neural networks in event log prediction, especially when combined with an ensemble method that dynamically promotes the better-performing model during inference.
We compare lightweight automata-based models (n-grams) with neural architectures (LSTM, Transformer) for next-activity prediction in streaming event logs. Experiments on synthetic patterns and five real-world process mining datasets show that n-grams with appropriate context windows achieve accuracy comparable to that of neural models while requiring substantially fewer resources. Unlike windowed neural architectures, whose performance is unstable, n-grams provide stable and consistent accuracy. We demonstrate that classical ensemble methods such as voting improve n-gram performance, but they require running many agents in parallel during inference, which increases memory consumption and latency. We propose an ensemble method, the promotion algorithm, that dynamically selects between two active models during inference, reducing overhead compared to classical voting schemes. On real-world datasets, these ensembles match or exceed the accuracy of non-windowed neural models at lower computational cost.
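To make the idea concrete, the following is a minimal Python sketch of an n-gram next-activity predictor together with a selector that keeps only two models active at a time. The abstract does not specify the promotion rule, so everything here is an assumption for illustration: the class names `NGramPredictor` and `TwoModelSelector`, the `window` parameter, and the rule "swap when the challenger scored more hits over the last window" are hypothetical and should not be read as the authors' algorithm.

```python
from collections import Counter, defaultdict


class NGramPredictor:
    """Predicts the next activity from the last (n - 1) activities of a trace."""

    def __init__(self, n):
        self.n = n
        # Maps a context tuple to a counter of the activities that followed it.
        self.counts = defaultdict(Counter)

    def _context(self, prefix):
        k = self.n - 1
        return tuple(prefix[-k:]) if k > 0 else ()

    def predict(self, prefix):
        followers = self.counts.get(self._context(prefix))
        if not followers:
            return None  # context not seen yet
        return followers.most_common(1)[0][0]

    def update(self, prefix, actual):
        self.counts[self._context(prefix)][actual] += 1


class TwoModelSelector:
    """Hypothetical promotion rule (an assumption, not the paper's method):
    answer with the primary model, score both models online, and swap them
    whenever the challenger scored more hits over the last `window` events."""

    def __init__(self, primary, challenger, window=50):
        self.primary = primary
        self.challenger = challenger
        self.window = window
        self.primary_hits = 0
        self.challenger_hits = 0
        self.seen = 0

    def step(self, prefix, actual):
        # Predict first, then learn from the observed activity (streaming setting).
        guess = self.primary.predict(prefix)
        self.primary_hits += int(guess == actual)
        self.challenger_hits += int(self.challenger.predict(prefix) == actual)
        self.primary.update(prefix, actual)
        self.challenger.update(prefix, actual)
        self.seen += 1
        if self.seen == self.window:
            if self.challenger_hits > self.primary_hits:
                self.primary, self.challenger = self.challenger, self.primary
            self.primary_hits = self.challenger_hits = self.seen = 0
        return guess


# Toy stream: after "a", the next activity depends on what preceded the "a",
# so a bigram (n=2) is ambiguous while a trigram (n=3) is not.
stream = list("abac" * 10)
selector = TwoModelSelector(NGramPredictor(2), NGramPredictor(3), window=20)
guesses = [selector.step(stream[:i], stream[i]) for i in range(len(stream))]
print("primary model order:", selector.primary.n)
```

Only two models are queried per event, in contrast to a voting ensemble that queries every member. On this toy stream the trigram overtakes the bigram after the first window and then serves the remaining predictions.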