The paper addresses the bandwidth sensitivity of Mean-Shift clustering, which leads to over-segmentation, especially in data-scarce scenarios. The authors introduce Doubly Stochastic Mean-Shift (DSMS), which incorporates randomness in both the trajectory updates and the kernel bandwidth selection. On synthetic Gaussian mixtures, DSMS is more stable than standard and stochastic Mean-Shift and prevents over-segmentation in sparse clustering.
Randomizing the bandwidth in mean-shift clustering acts as an implicit regularizer, dramatically improving stability and preventing over-segmentation in sparse data regimes.
Standard Mean-Shift algorithms are notoriously sensitive to the bandwidth hyperparameter, particularly in data-scarce regimes where fixed-scale density estimation leads to fragmentation and spurious modes. In this paper, we propose Doubly Stochastic Mean-Shift (DSMS), a novel extension that introduces randomness not only in the trajectory updates but also in the kernel bandwidth itself. By drawing both the data samples and the radius from a continuous uniform distribution at each iteration, DSMS explores the density landscape more effectively. We show that this randomized bandwidth policy acts as an implicit regularization mechanism, and we provide theoretical convergence results. Comparative experiments on synthetic Gaussian mixtures show that DSMS significantly outperforms standard and stochastic Mean-Shift baselines, remaining stable and preventing over-segmentation in sparse clustering scenarios with no other loss in performance.
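To make the two sources of randomness concrete, here is a minimal sketch of one way such an update could look. It is not the authors' implementation. It assumes a flat (uniform-ball) kernel, reads "drawing the data samples" as uniform subsampling of the dataset at each step, and uses hypothetical names and defaults (`dsms_step`, `dsms`, `h_min`, `h_max`, `batch_size`) that do not come from the paper.

```python
import numpy as np

def dsms_step(x, data, rng, h_min, h_max, batch_size):
    """One illustrative doubly stochastic mean-shift update for a point x.

    Assumptions (not from the paper): flat kernel, uniform subsampling.
    """
    # Randomness 1: use a random subset of the data for this update.
    size = min(batch_size, len(data))
    idx = rng.choice(len(data), size=size, replace=False)
    batch = data[idx]

    # Randomness 2: draw the radius from a continuous uniform distribution.
    h = rng.uniform(h_min, h_max)

    # Flat kernel: move x to the mean of the sampled points within radius h.
    dist = np.linalg.norm(batch - x, axis=1)
    neighbors = batch[dist <= h]
    if len(neighbors) == 0:
        return x  # empty window: leave the point where it is
    return neighbors.mean(axis=0)

def dsms(data, n_iter=200, h_min=0.5, h_max=2.0, batch_size=32, seed=0):
    """Run the illustrative update on every point for a fixed number of passes."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    points = data.copy()
    for _ in range(n_iter):
        for i in range(len(points)):
            points[i] = dsms_step(points[i], data, rng, h_min, h_max, batch_size)
    return points
```

Because the radius changes at every step, a point that would stall on a spurious mode under a small fixed bandwidth gets further chances to move whenever a larger radius is drawn. This is one intuition for the regularizing effect described above. The fixed iteration count stands in for a proper stopping rule, which this sketch does not attempt.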