This paper establishes the consistency of the k-nearest neighbor (k-NN) regressor when applied to data obtained from complex survey designs, addressing a gap in the literature where existing consistency results primarily focus on i.i.d. data. The authors prove consistency under specific regularity conditions related to the sampling design and data distribution. They also derive convergence rate lower bounds, demonstrating that the curse of dimensionality persists even in complex survey settings, and validate their findings through simulations and real-world data experiments.
k-NN regression, a classic non-parametric method, now has a consistency guarantee for complex survey data under regularity conditions on the sampling design, which extends its theoretical justification beyond the i.i.d. setting.
We study the consistency of the $k$-nearest neighbor regressor under complex survey designs. While consistency results for this algorithm are well established for independent and identically distributed data, corresponding results for complex survey data are lacking. We show that the $k$-nearest neighbor regressor is consistent under regularity conditions on the sampling design and the distribution of the data. We derive lower bounds for the rate of convergence and show that these bounds exhibit the curse of dimensionality, as in the independent and identically distributed setting. Empirical studies based on simulated and real data illustrate our theoretical findings.
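To make the object of study concrete, the following is a minimal sketch of a design-weighted $k$-nearest neighbor regressor. The abstract does not state the exact form of the estimator, so the Hájek-type weighted average used here, the function name `weighted_knn_predict`, and the interpretation of `weights` as inverse inclusion probabilities are all assumptions made for illustration, not the authors' definition.

```python
import numpy as np


def weighted_knn_predict(X_sample, y_sample, weights, x_query, k):
    """Design-weighted k-NN estimate at x_query.

    Assumption (not from the paper): the prediction is a Hajek-type ratio,
    i.e. the weighted mean of the responses of the k sampled units closest
    to x_query, with `weights` playing the role of survey design weights
    (for example, inverse inclusion probabilities).
    """
    X_sample = np.asarray(X_sample, dtype=float)
    y_sample = np.asarray(y_sample, dtype=float)
    weights = np.asarray(weights, dtype=float)
    x_query = np.asarray(x_query, dtype=float)

    # Euclidean distance from the query point to every sampled unit.
    dists = np.linalg.norm(X_sample - x_query, axis=1)

    # Indices of the k nearest sampled units (stable sort breaks ties by order).
    nn = np.argsort(dists, kind="stable")[:k]

    # Weighted average of the neighbors' responses.
    return float(np.sum(weights[nn] * y_sample[nn]) / np.sum(weights[nn]))
```

With equal weights this reduces to the usual unweighted $k$-NN average, so the i.i.d. estimator is recovered as a special case. Unequal weights shift the prediction toward neighbors that represent more population units under the assumed design.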