ByteDance, SDU | Jun 17, 2026 | arXiv:2606.18750

Ensuring Trustworthy Online A/B Testing: Addressing Five Key Questions on CUPED

Yu Zhang, Bokui Wan, Yongli Qin, Jinyong Ma, Yifan Guo

AI Summary

This paper examines methodological nuances in applying Controlled-experiment Using Pre-Experiment Data (CUPED) to A/B testing, organized around five frequently encountered but largely overlooked questions. The authors compare post-CUPED estimators to identify the optimal adjustment specification, evaluate the validity of regression-based adjustments together with robust variance estimation for them, and extend the analysis to multi-arm experiments and two-stage sampling designs. In those settings, they find that naive reliance on standard variance estimators can lead to severely misleading inferences. The recommended methods have been deployed in ByteDance's experimentation platform.

Key Contribution

In multi-arm experiments and two-stage sampling designs, naive reliance on standard variance estimators after CUPED can lead to severely misleading inferences.

Abstract

A/B testing has become the gold standard for data-driven decision-making in large-scale online experimentation, providing critical guidance for feature launch, pricing optimization, and user experience enhancement. To maximize statistical sensitivity, many technology companies routinely employ Controlled-experiment Using Pre-Experiment Data (CUPED), a technique that achieves substantial variance reduction while preserving the unbiasedness of estimating the average treatment effect. Despite its widespread adoption, several critical methodological and practical nuances of CUPED remain underexplored. This paper systematically addresses five frequently encountered yet overlooked questions regarding the application of CUPED. First, we provide a comparative analysis of various post-CUPED estimators to identify the optimal adjustment specification. Second, we evaluate the validity of regression-based adjustments and delineate robust variance estimation methods tailored for such frameworks. Finally, we extend our investigation to complex but common scenarios, including multi-arm experiments and two-stage sampling designs. Our findings reveal that in these settings, naive reliance on standard variance estimators can lead to severely misleading inferences. By offering rigorous theoretical insights and extensive experimental validation, this work deepens the conceptual understanding of CUPED. Notably, the recommended methodologies have been successfully deployed and integrated into ByteDance's experimentation platform.
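As a rough illustration of the variance reduction the abstract refers to, the sketch below applies the textbook single-covariate CUPED adjustment, Y_adj = Y - theta * (X - mean(X)) with theta = Cov(X, Y) / Var(X), and compares it with a plain difference in means on simulated data. The function names `cuped_ate` and `simulate`, the data-generating process, and the use of one pooled theta across both arms are assumptions made here for illustration; they are not the estimators or variance formulas the paper recommends.

```python
import numpy as np


def cuped_ate(y, x, treated):
    """Return (naive, adjusted, theta) for a two-arm experiment.

    y: in-experiment metric, x: pre-experiment covariate,
    treated: boolean assignment indicator (True = treatment arm).
    Hypothetical helper: uses a single theta pooled over both arms.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    treated = np.asarray(treated, dtype=bool)

    # Pooled CUPED coefficient: Cov(X, Y) / Var(X).
    theta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

    # Subtract the part of Y explained by the centered covariate.
    y_adj = y - theta * (x - x.mean())

    naive = y[treated].mean() - y[~treated].mean()
    adjusted = y_adj[treated].mean() - y_adj[~treated].mean()
    return naive, adjusted, theta


def simulate(n_reps=200, n=1000, effect=0.5, seed=0):
    """Repeat a simulated A/B test and collect both estimates per run."""
    rng = np.random.default_rng(seed)
    naive, adjusted = [], []
    for _ in range(n_reps):
        x = rng.normal(size=n)            # pre-experiment covariate
        treated = rng.random(n) < 0.5     # Bernoulli(0.5) assignment
        y = 2.0 * x + rng.normal(size=n) + effect * treated
        d_naive, d_adj, _ = cuped_ate(y, x, treated)
        naive.append(d_naive)
        adjusted.append(d_adj)
    return np.array(naive), np.array(adjusted)


if __name__ == "__main__":
    naive, adjusted = simulate()
    print("variance of naive estimates:   ", naive.var(ddof=1))
    print("variance of adjusted estimates:", adjusted.var(ddof=1))
```

Because the covariate is measured before assignment, the adjustment changes the estimate only by theta times the between-arm imbalance in X, which is why it can reduce variance without shifting the target of estimation. The simulation deliberately reports the empirical spread of the estimates over repeated runs rather than a plug-in standard error, since how to estimate the variance correctly is exactly what the paper argues needs care.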

Data Curation & Synthetic Data

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
