This paper introduces "directional stability," a target-specific condition weaker than existing target-agnostic stability conditions, to enable efficient inference after adaptive data collection (e.g., bandit algorithms). Under directional stability, estimators that are efficient under i.i.d. data remain asymptotically normal and semiparametrically efficient when applied to adaptively collected data. The authors demonstrate that directional stability holds for LinUCB, providing the first semiparametric efficiency guarantee for regular scalar targets under LinUCB sampling.
Unlock asymptotically normal and semiparametrically efficient estimators in adaptive data collection by using a novel target-specific condition called "directional stability," which is weaker than previous target-agnostic conditions.
We study inference on scalar-valued pathwise differentiable targets after adaptive data collection, for example by a bandit algorithm. We introduce a novel target-specific condition, directional stability, which is strictly weaker than previously imposed target-agnostic stability conditions. Under directional stability, we show that estimators that would have been efficient under i.i.d. data remain asymptotically normal and semiparametrically efficient when computed from adaptively collected trajectories. The canonical gradient has a martingale form, and directional stability guarantees stabilization of its predictable quadratic variation, which yields asymptotic normality. We characterize efficiency using a convolution theorem for the adaptive-data setting, and give a condition under which the one-step estimator attains the efficiency bound. We verify directional stability for LinUCB, yielding the first semiparametric efficiency guarantee for a regular scalar target under LinUCB sampling.
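The step from a stabilized predictable quadratic variation to asymptotic normality can be sketched with a standard martingale central limit theorem. The notation below (D_t, F_{t-1}, S_n, V_n, sigma^2) is assumed for illustration and is not taken from the paper; the paper's own conditions may be stated differently.

```latex
% Illustrative notation (assumed, not the paper's):
%   D_t             : increment of the canonical gradient at round t
%   \mathcal{F}_{t-1} : history available before round t
% Martingale form: each increment has conditional mean zero.
\[
  S_n = \frac{1}{\sqrt{n}} \sum_{t=1}^{n} D_t,
  \qquad
  \mathbb{E}\left[ D_t \mid \mathcal{F}_{t-1} \right] = 0 .
\]
% Stabilization: the predictable quadratic variation converges
% in probability to a deterministic limit.
\[
  V_n = \frac{1}{n} \sum_{t=1}^{n}
        \mathbb{E}\left[ D_t^{2} \mid \mathcal{F}_{t-1} \right]
  \;\xrightarrow{\;p\;}\; \sigma^{2} \in (0, \infty) .
\]
% Together with a conditional Lindeberg condition,
% a martingale CLT then gives
\[
  S_n \;\xrightarrow{\;d\;}\; \mathcal{N}\left(0, \sigma^{2}\right) .
\]
```

In this reading, adaptivity enters only through the conditional variances, so what matters is that V_n settles to a fixed limit along the direction relevant to the target, not that the sampling rule stabilizes in every respect.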