This paper introduces an entropy-based estimator for quantifying the intrinsic accuracy limits of sequential recommender systems. It addresses two weaknesses of prior predictability analyses: sensitivity to how the candidate space is specified, and distortion from Fano-based scaling in low-predictability regimes. The estimator is training-free and candidate-size-agnostic, which makes it suited to assessing data difficulty and estimating headroom. In experiments on synthetic and real-world datasets, it tracks oracle-controlled difficulty more faithfully than baselines, achieves high rank consistency with the best offline accuracy of state-of-the-art sequential recommenders (Spearman rho up to 0.914), and supports user-group diagnostics based on predictability.
Before sinking time into a new recommender model, use this entropy-based estimator to gauge the accuracy ceiling the data itself allows, and so whether real headroom remains or the data is the limiting factor.
Sequential recommender systems have achieved steady gains in offline accuracy, yet it remains unclear how close current models are to the intrinsic accuracy limit imposed by the data. A reliable, model-agnostic estimate of this ceiling would enable principled difficulty assessment and headroom estimation before costly model development. Existing predictability analyses typically combine entropy estimation with Fano's inequality inversion; however, in recommendation they are hindered by sensitivity to candidate-space specification and distortion from Fano-based scaling in low-predictability regimes. We develop an entropy-induced, training-free approach for quantifying accuracy limits in sequential recommendation, yielding a candidate-size-agnostic estimate. Experiments on controlled synthetic generators and diverse real-world benchmarks show that the estimator tracks oracle-controlled difficulty more faithfully than baselines, remains insensitive to candidate-set size, and achieves high rank consistency with best-achieved offline accuracy across state-of-the-art sequential recommenders (Spearman rho up to 0.914). It also supports user-group diagnostics by stratifying users by novelty preference, long-tail exposure, and activity, revealing systematic predictability differences. Furthermore, predictability can guide training data selection: training sets constructed from high-predictability users yield strong downstream performance under reduced data budgets. Overall, the proposed estimator provides a practical reference for assessing attainable accuracy limits, supporting user-group diagnostics, and informing data-centric decisions in sequential recommendation.
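To make the candidate-space sensitivity concrete, the following is a minimal sketch of the conventional baseline the abstract refers to (entropy estimation followed by Fano's inequality inversion), not the paper's proposed estimator. It assumes an entropy value in bits is already available; the function names `binary_entropy` and `fano_max_predictability` and the example numbers are hypothetical, chosen only for illustration. The sketch solves H = H_b(pi) + (1 - pi) * log2(N - 1) for the predictability bound pi, and shows that the same entropy yields different bounds as the assumed candidate-set size N changes.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H_b(p) in bits; defined as 0 at the endpoints."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def fano_max_predictability(entropy_bits: float, n_candidates: int,
                            tol: float = 1e-10) -> float:
    """Invert Fano's inequality by bisection.

    Solves H = H_b(pi) + (1 - pi) * log2(N - 1) for pi in [1/N, 1],
    where H is the entropy estimate and N the assumed candidate-set size.
    """
    if n_candidates < 2:
        raise ValueError("need at least two candidates")
    max_entropy = math.log2(n_candidates)
    if entropy_bits < 0.0 or entropy_bits > max_entropy + 1e-12:
        raise ValueError("entropy must lie in [0, log2(N)]")

    def fano_rhs(pi: float) -> float:
        return binary_entropy(pi) + (1.0 - pi) * math.log2(n_candidates - 1)

    # fano_rhs decreases from log2(N) at pi = 1/N to 0 at pi = 1,
    # so bisection on this interval finds the unique solution.
    lo, hi = 1.0 / n_candidates, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if fano_rhs(mid) > entropy_bits:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Same (hypothetical) entropy estimate, different assumed candidate sizes.
    for n in (100, 1_000, 10_000):
        bound = fano_max_predictability(3.0, n)
        print(f"N={n:>6}: predictability bound = {bound:.4f}")
```

Because the right-hand side of the equation grows with N at any fixed pi, a larger assumed candidate set pushes the inverted bound upward for the same entropy. This is the kind of dependence on candidate-space specification that the abstract describes as a limitation of Fano-based analyses.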