ClusterAnalysis.jl
Create Silhouette coefficient for KMeans clustering.
This PR adds Silhouette coefficient calculation for KMeans clustering to ClusterAnalysis.jl. I propose two ways of calculating the metric: a primary version, which is computationally even more expensive than KMeans itself, and a Simplified version for big datasets. You can control when the code switches to the Simplified version by changing the `10^3` threshold in `if big(size(data, 1))^2 > 10^3` inside the Silhouette function.
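To illustrate the trade-off, here is a minimal sketch of the two variants and the size-based switch. It is not the code from this PR: the function names (`silhouette_exact`, `silhouette_simplified`, `silhouette_sketch`), the `threshold` keyword, and the row-per-observation layout are all hypothetical. It also assumes that the Simplified version is the common centroid-based approximation, that distances are Euclidean, and that there are at least two clusters.

```julia
using LinearAlgebra: norm
using Statistics: mean

# Exact variant: compares every point with every other point, so O(n^2) distances.
# `data` is n×d (one observation per row); `labels[i]` is a cluster index in 1:k.
function silhouette_exact(data::AbstractMatrix, labels::AbstractVector{<:Integer})
    n = size(data, 1)
    k = maximum(labels)
    scores = zeros(n)
    for i in 1:n
        sums = zeros(k)         # summed distance from point i to each cluster
        counts = zeros(Int, k)  # number of other points in each cluster
        for j in 1:n
            j == i && continue
            sums[labels[j]] += norm(view(data, i, :) .- view(data, j, :))
            counts[labels[j]] += 1
        end
        own = labels[i]
        counts[own] == 0 && continue  # singleton cluster: score stays 0
        a = sums[own] / counts[own]   # mean distance within own cluster
        b = minimum(sums[c] / counts[c] for c in 1:k if c != own && counts[c] > 0)
        m = max(a, b)
        scores[i] = m == 0 ? 0.0 : (b - a) / m
    end
    return mean(scores)
end

# Simplified variant (assumed centroid-based): only n*k distances.
# `centroids` is k×d, with row c holding the centre of cluster c.
function silhouette_simplified(data::AbstractMatrix, labels::AbstractVector{<:Integer},
                               centroids::AbstractMatrix)
    n = size(data, 1)
    k = size(centroids, 1)
    scores = zeros(n)
    for i in 1:n
        d = [norm(view(data, i, :) .- view(centroids, c, :)) for c in 1:k]
        a = d[labels[i]]                                   # distance to own centroid
        b = minimum(d[c] for c in 1:k if c != labels[i])   # nearest other centroid
        m = max(a, b)
        scores[i] = m == 0 ? 0.0 : (b - a) / m
    end
    return mean(scores)
end

# Size-based switch, mirroring the condition quoted above.
# `threshold` is a hypothetical keyword standing in for the hard-coded 10^3.
function silhouette_sketch(data, labels, centroids; threshold = 10^3)
    if big(size(data, 1))^2 > threshold
        return silhouette_simplified(data, labels, centroids)
    else
        return silhouette_exact(data, labels)
    end
end

# Tiny usage example: two well-separated clusters of two points each.
data = [0.0 0.0; 0.0 1.0; 10.0 0.0; 10.0 1.0]
labels = [1, 1, 2, 2]
centroids = [0.0 0.5; 10.0 0.5]
println(silhouette_sketch(data, labels, centroids))                 # exact path (4^2 <= 10^3)
println(silhouette_sketch(data, labels, centroids; threshold = 10)) # simplified path
```

The `big(...)` call keeps the squared row count from overflowing `Int` for very large datasets, which is presumably why the condition is written that way.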
I also posted a small benchmark here:
https://discourse.julialang.org/t/silhouette-coefficient-calculation/89861/6?u=shayan