This paper introduces ExDBSCAN, a post-hoc explanation method for DBSCAN clustering that generates counterfactual explanations for point assignments. ExDBSCAN leverages a density-connected weighted graph and a physics-inspired model to balance proximity and diversity when generating counterfactuals. Experiments on 30 tabular datasets demonstrate that ExDBSCAN outperforms baselines in terms of validity, diversity, and proximity of the generated counterfactuals.
Uncover the "why" behind DBSCAN assignments with counterfactual explanations that reveal how small data changes can flip a point from inlier to outlier.
Clustering is an unsupervised technique for grouping data points by similarity. While explainability methods exist for supervised machine learning, they are not directly applicable to clustering, making it challenging to understand cluster assignments. This interpretability gap is particularly evident in the popular density-based method DBSCAN, which labels points as inliers (cluster members in dense regions) or outliers (noise points in sparse regions). DBSCAN does not provide insight into why a particular point receives its assignment or whether that assignment is robust to small changes in the data. To address this lack of explainability, we introduce ExDBSCAN, a density-aware, post-hoc explanation method. ExDBSCAN offers actionable counterfactual explanations with theoretical guarantees for validity. It generates multiple counterfactuals using a density-connected weighted graph and a physics-inspired model that repels counterfactual candidates from one another (diversity) while pulling them toward the instance to explain (proximity). An empirical evaluation on 30 tabular datasets against four baselines shows that ExDBSCAN outperforms all of them while attaining perfect validity and retrieving diverse, proximal counterfactuals.
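To make the inlier/outlier distinction and the idea of a counterfactual concrete, here is a minimal sketch in plain Python. It is not the ExDBSCAN algorithm: it does not build the density-connected graph or use the physics-inspired model, and it returns a single counterfactual rather than a diverse set. It assumes Euclidean distance and the usual DBSCAN parameters (called `eps` and `min_pts` here), and simply moves an outlier in a straight line toward its nearest core point until it falls inside that point's `eps`-neighborhood. All function names and the toy data are hypothetical.

```python
import math

def neighbors(points, p, eps):
    """Indices of all points within distance eps of p (Euclidean)."""
    return [i for i, q in enumerate(points) if math.dist(p, q) <= eps]

def core_indices(points, eps, min_pts):
    """A core point has at least min_pts points (itself included) within eps."""
    return [i for i, p in enumerate(points)
            if len(neighbors(points, p, eps)) >= min_pts]

def is_inlier(points, cores, x, eps):
    """x is an inlier if it lies within eps of at least one core point."""
    return any(math.dist(x, points[i]) <= eps for i in cores)

def naive_counterfactual(points, cores, x, eps, margin=1e-6):
    """Illustrative only: slide outlier x toward its nearest core point and
    stop just inside that core point's eps-ball. An outlier is not in any
    core point's neighborhood, so moving it cannot demote a core point."""
    nearest = min((points[i] for i in cores), key=lambda c: math.dist(x, c))
    d = math.dist(x, nearest)
    if d <= eps:
        return tuple(x)  # already an inlier, nothing to change
    t = (d - eps + margin) / d  # fraction of the segment to travel
    return tuple(a + t * (b - a) for a, b in zip(x, nearest))

# Toy data (hypothetical): a dense square plus one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
eps, min_pts = 1.5, 4

cores = core_indices(points, eps, min_pts)
x = points[4]
cf = naive_counterfactual(points, cores, x, eps)

print("core points:", cores)
print("original is inlier:", is_inlier(points, cores, x, eps))
print("counterfactual:", cf, "is inlier:", is_inlier(points, cores, cf, eps))
```

The sketch optimizes only for validity and a crude form of proximity. Producing several counterfactuals that are also spread apart from one another is the part the abstract attributes to the repulsion and attraction model.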