Search papers, labs, and topics across Lattice.
This paper investigates the minimal mean separation $\Delta$ required for partial recovery of clusters in a $K$-component Gaussian mixture model in $\mathbb{R}^d$ when $n \geq dK$. They establish a low-degree polynomial lower bound on $\Delta$ in this moderate-dimensional regime, demonstrating a "non-parametric rate" that differs from the high-dimensional case. The authors also present a novel, non-spectral algorithm that achieves this lower bound, improving upon existing polynomial-time procedures.
Clustering in moderate dimensions isn't just about dimension reduction; it hits a "non-parametric rate" computational barrier, revealed by a new low-degree polynomial lower bound.
We study the fundamental problem of clustering $n$ points into $K$ groups drawn from a mixture of isotropic Gaussians in $\mathbb{R}^d$. Specifically, we investigate the requisite minimal distance $\Delta$ between mean vectors to partially recover the underlying partition. While the minimax-optimal threshold for $\Delta$ is well-established, a significant gap exists between this information-theoretic limit and the performance of known polynomial-time procedures. Although this gap was recently characterized in the high-dimensional regime ($n \leq dK$), it remains largely unexplored in the moderate-dimensional regime ($n \geq dK$). In this manuscript, we address this regime by establishing a new low-degree polynomial lower bound for the moderate-dimensional case when $d \geq K$. We show that while the difficulty of clustering for $n \leq dK$ is primarily driven by dimension reduction and spectral methods, the moderate-dimensional regime involves more delicate phenomena leading to a"non-parametric rate". We provide a novel non-spectral algorithm matching this rate, shedding new light on the computational limits of the clustering problem in moderate dimension.