The paper introduces STAP, a Transformer-based model for next-app prediction that eliminates the need for a fixed app vocabulary by representing apps with randomly shuffled virtual indices. STAP compensates for the lost semantic information with an ultra-long context design; a theoretical analysis shows that the predicted distribution converges to the correct one given sufficient context length. Experiments on two datasets demonstrate strong zero-shot cross-dataset prediction accuracy and competitive cold-start performance, and the paper also describes a deployment strategy for maintaining long context during continuous inference.
Shuffle-tokenization lets you train a single next-app predictor that generalizes across app ecosystems, even when you don't know the app names.
Predicting the next mobile application a user will launch is essential for intelligent device resource management and proactive assistance. Existing models rely on fixed app vocabularies, which prevents them from generalizing across different app ecosystems. Many also depend on user-specific knowledge, which complicates deployment in cold start scenarios. We propose STAP, a Transformer-based model that eliminates the need for a fixed vocabulary. STAP replaces true app identities with randomly reassigned virtual indices via a shuffle mechanism, and compensates for discarded semantic information by processing behavioral sequences with an ultra-long context design. A theoretical analysis shows that, given a sufficiently long context, the predicted distribution converges to the correct one despite the anonymity of the mapping. Experiments on two datasets from different continents demonstrate that STAP achieves strong cross-dataset zero-shot prediction accuracy -- a setting where all existing fixed-vocabulary methods are inherently inapplicable -- while its cold start performance within each dataset remains competitive with leading models. Furthermore, we introduce a deployment strategy that enables the model to retain a sufficiently long context during continuous inference while keeping latency within acceptable bounds.
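To make the shuffle mechanism concrete, here is a minimal sketch of how true app identities might be replaced with randomly reassigned virtual indices. It is an illustration under stated assumptions, not the authors' implementation: the function name `shuffle_tokenize`, the `num_virtual` parameter, and the choice to draw one fresh mapping per sequence are all hypothetical, since the abstract does not specify these details.

```python
import random
from typing import Dict, Hashable, List, Optional, Sequence, Tuple


def shuffle_tokenize(
    apps: Sequence[Hashable],
    num_virtual: int,
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], Dict[Hashable, int]]:
    """Map each distinct app in a sequence to a random virtual index.

    Assumption (not from the paper): the mapping is drawn once per
    sequence, without replacement, from range(num_virtual).
    """
    if rng is None:
        rng = random.Random()

    # Distinct apps in first-seen order (dicts preserve insertion order).
    distinct = list(dict.fromkeys(apps))
    if len(distinct) > num_virtual:
        raise ValueError("more distinct apps than virtual indices")

    # Sample unique virtual indices so two apps never share a slot.
    slots = rng.sample(range(num_virtual), len(distinct))
    mapping = dict(zip(distinct, slots))

    # The model would only ever see these anonymous indices.
    return [mapping[app] for app in apps], mapping


history = ["mail", "maps", "mail", "camera", "maps"]
tokens, mapping = shuffle_tokenize(history, num_virtual=8, rng=random.Random(0))

# Predictions come back as virtual indices; inverting the mapping
# recovers the real app on the device.
inverse = {index: app for app, index in mapping.items()}
recovered = [inverse[t] for t in tokens]
```

Under this sketch, repeated launches of the same app share an index within a sequence, but the index itself carries no meaning across sequences, so the model has to infer each app's role from the behavioral context alone. That is the property the abstract says a long context compensates for.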