The paper introduces Uncertainty-Triggered Test-Time Selective Inference (UTTSI), a training-free, model-agnostic framework that scales inference depth in CTR prediction models according to per-instance uncertainty. UTTSI uses a dual-signal estimator to identify epistemic uncertainty. Every instance undergoes adaptive feature filtering; uncertain instances additionally receive stochastic feature-path explorations, while confident instances bypass exploration. Experiments on four datasets with three backbone architectures, plus a seven-day online A/B test, show statistically significant CTR gains at an average overhead of approximately 2.8x base model cost.
Get 5.3% more clicks by intelligently scaling your CTR model's inference depth only when it's uncertain, without retraining or increasing worst-case latency.
Scaling test-time compute has proven highly effective for language models, yet this opportunity remains largely unexplored for industrial Click-Through Rate (CTR) prediction. CTR models suffer from a fundamental asymmetry: feature combinations well-represented in training yield confident predictions, while sparsely observed ones produce unreliable outputs. Existing training-phase solutions such as adaptive gating learn a fixed selection function subject to the same sparsity, offering no per-instance recourse at deployment.
We propose UTTSI (Uncertainty-Triggered Test-Time Selective Inference), a training-free, model-agnostic framework that scales inference depth proportionally to per-instance uncertainty. A dual-signal estimator combining model logit confidence with a data-level frequency prior distinguishes epistemic uncertainty from aleatoric ambiguity. Every instance undergoes adaptive feature filtering to remove unreliable embeddings; uncertain instances additionally receive stochastic feature-path explorations whose predictions are aggregated via consistency-weighted ensembling. Confident instances bypass exploration entirely, keeping average overhead at approximately $2.8\times$ base model cost with worst-case latency unchanged.
Experiments on four datasets with three backbone architectures demonstrate consistent, statistically significant gains over all training-phase baselines. A seven-day online A/B test further confirms a 5.3% relative CTR gain ($p < 0.01$), establishing selective test-time compute allocation as a practical complement to training-phase advances for CTR prediction.
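The control flow described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the trigger rule (small logit magnitude combined with a rarely seen feature value), the count-based filter, the random feature dropping used for exploration, and the median-distance weighting are all assumptions chosen for illustration. Every name, threshold, and the toy model is hypothetical.

```python
import math
import random
import statistics
from typing import Callable, Dict, Tuple

Features = Dict[str, str]             # field name -> categorical value
Counts = Dict[Tuple[str, str], int]   # (field, value) -> training-set count
Model = Callable[[Features], float]   # frozen base model, returns a click logit


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def is_uncertain(logit: float, feats: Features, freq: Counts,
                 margin: float = 1.0, rare_below: int = 5) -> bool:
    """Assumed dual-signal trigger: the model is unsure AND the data is sparse.

    A low-margin logit on frequently seen features is treated as aleatoric
    ambiguity and does not trigger exploration.
    """
    low_confidence = abs(logit) < margin
    rarely_seen = any(freq.get(item, 0) < rare_below for item in feats.items())
    return low_confidence and rarely_seen


def predict(model: Model, feats: Features, freq: Counts, k: int = 4,
            p_drop: float = 0.2, drop_below: int = 2, tau: float = 0.05,
            seed: int = 0) -> Tuple[float, int]:
    """Return (click probability, number of model passes used)."""
    # Filtering step, applied to every instance: drop feature values whose
    # training count is too low to trust (assumed criterion).
    kept = {f: v for f, v in feats.items() if freq.get((f, v), 0) >= drop_below}
    logit = model(kept)
    if not is_uncertain(logit, feats, freq):
        return sigmoid(logit), 1  # confident: single pass, no exploration

    # Exploration step: k stochastic feature paths, each a random subset of
    # the kept features (assumed form of "feature-path exploration").
    rng = random.Random(seed)
    preds = [sigmoid(logit)]
    for _ in range(k):
        path = {f: v for f, v in kept.items() if rng.random() >= p_drop}
        preds.append(sigmoid(model(path)))

    # Consistency weighting (assumed form): paths that agree with the median
    # prediction receive larger weight.
    center = statistics.median(preds)
    weights = [math.exp(-abs(p - center) / tau) for p in preds]
    prob = sum(w * p for w, p in zip(weights, preds)) / sum(weights)
    return prob, len(preds)


# Hypothetical toy data: a linear "model" and training-set counts.
TOY_WEIGHTS = {("ad", "a1"): 2.5, ("ad", "a7"): 0.4,
               ("user", "u1"): 0.5, ("user", "u9"): -0.3}
FREQ: Counts = {("ad", "a1"): 100, ("ad", "a7"): 40,
                ("user", "u1"): 50, ("user", "u9"): 3}


def toy_model(feats: Features) -> float:
    return sum(TOY_WEIGHTS.get(item, 0.0) for item in feats.items())


confident = {"ad": "a1", "user": "u1"}   # large logit, well-covered features
uncertain = {"ad": "a7", "user": "u9"}   # small logit, rare user value

print(predict(toy_model, confident, FREQ))
print(predict(toy_model, uncertain, FREQ))
```

In this sketch a confident instance costs one model pass and an uncertain one costs 1 + k passes, so the average overhead depends on how often the trigger fires. The sketch makes no attempt to reproduce the reported $2.8\times$ figure or the latency behaviour.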