yellowbrick
Density based clustering validity measures
Clustering scores like silhouette work well for K-Means but make less sense for density-based clustering techniques like DBSCAN, which support arbitrary cluster shapes. It would be nice to include scores and visualisations for measures that support density-based notions of clustering. These are a little thin on the ground, but the Density Based Cluster Validity Index (DBCV) of Moulavi et al. (http://www.dbs.ifi.lmu.de/~zimek/publications/SDM2014/DBCV.pdf) is one of the better ones.
Absolutely, that would be awesome. We had to add our own distortion score metric; would you be willing to write up some Python to compute the cluster validity index? Check out distortion_score for the signature and input.
I have some code for it here. It has some dependency on hdbscan, but in practice that amounts to the mst_linkage_core, which you can replace with any suitable minimum spanning tree code.
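For anyone who wants to drop the hdbscan dependency, here is a minimal sketch of what a replacement minimum spanning tree routine could look like. It is plain Prim's algorithm over a dense symmetric distance matrix, not the hdbscan implementation; the name prim_mst and its return format are made up for illustration. For DBCV, the matrix passed in would presumably hold mutual reachability distances rather than raw distances.

```python
import math


def prim_mst(dist):
    """Return MST edges as (i, j, weight) for a dense symmetric distance matrix.

    Hypothetical helper: a pure-Python stand-in for an MST routine,
    not the hdbscan mst_linkage_core function.
    """
    n = len(dist)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge weight linking each node to the tree
    parent = [-1] * n      # tree node at the other end of that cheapest edge
    best[0] = 0.0          # start growing the tree from node 0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree with the cheapest connecting edge.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, dist[parent[u]][u]))
        # Relax the connecting edges of the remaining nodes through u.
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges


# Four points on a line at 0, 1, 3 and 6.
points = [0.0, 1.0, 3.0, 6.0]
dist = [[abs(a - b) for b in points] for a in points]
print(prim_mst(dist))  # [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 3.0)]
```

This is O(n^2) in time and needs the full matrix in memory, so it is only meant to show the shape of the replacement; a library MST would be the better choice for larger datasets.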