Clustering.jl

add HDBSCAN?

Open currymj opened this issue 6 years ago • 4 comments

DBSCAN is already included. Its successor, HDBSCAN, has a famously good Python package (hdbscan) and is fairly popular.

Since Clustering.jl already has DBSCAN as well as hierarchical clustering algorithms, it's possible some code could be reused. There's a good explanation here of all the pieces of the algorithm.

I wish I were submitting a PR instead of just a feature request issue, but I still think a pure Julia implementation would be good to have.

Also, if anybody Googling for a Julia HDBSCAN implementation stumbles on this issue: you can just use PyCall.jl to call the hdbscan Python package. It works fine; just remember to transpose your data matrix, because the Python package expects one data point per row, whereas Clustering.jl expects one data point per column.
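To make the layout difference concrete, here is a minimal sketch of the Python side. The helper names `columns_to_rows` and `cluster_columns` are made up for illustration, and `cluster_columns` assumes the third-party hdbscan package is installed. It is only meant to show which orientation the package wants, not how a PyCall.jl call must be written.

```python
def columns_to_rows(data_by_feature):
    """Transpose a features-by-samples table into samples-by-features.

    data_by_feature[i][j] is feature i of sample j, i.e. one sample per
    column (the layout Clustering.jl uses). The result has one sample per
    row, which is the layout the hdbscan Python package expects.
    """
    return [list(sample) for sample in zip(*data_by_feature)]


def cluster_columns(data_by_feature, min_cluster_size=5):
    """Hypothetical helper: cluster column-oriented data with hdbscan."""
    import hdbscan  # third-party package, assumed to be installed

    rows = columns_to_rows(data_by_feature)
    clusterer = hdbscan.HDBSCAN(min_cluster_size=min_cluster_size)
    # One label per sample; hdbscan marks noise points with -1.
    return clusterer.fit_predict(rows)


# Two features (rows) by three samples (columns), as Julia code would store it.
by_feature = [
    [0.0, 0.1, 5.0],
    [1.0, 1.1, 6.0],
]
by_sample = columns_to_rows(by_feature)  # three rows, two values each
```

From Julia the same thing amounts to passing the transposed matrix through PyCall.jl, so a d×n matrix arrives in Python as n rows of d features.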

currymj, Jan 27 '19 11:01

I created a minimum-effort wrapper here https://github.com/baggepinnen/HDBSCAN.jl

baggepinnen, Oct 04 '19 07:10

A Julia version is always best, but thanks for the wrapper, @baggepinnen.

babaq, Jun 09 '20 23:06

@MommaWatasu has coded this in pure Julia here: https://github.com/MommaWatasu/HorseML.jl/blob/master/src/Clustering/HDBSCAN.jl

Note that in this implementation, data points are rows instead of columns.

It looks simple and clean; I can't believe how much has been written for HorseML. I don't know how it fits with the rest of the Clustering.jl API, but I will try it out on my dataset now. I wonder how he would feel about the code being reused in Clustering.jl.

chelate, Mar 22 '24 09:03

I wrote the code only for learning, and I haven't maintained it for a long time (since no one uses it). I created a PR which contains my code from HorseML.jl. I hope my code is useful.

MommaWatasu, Mar 22 '24 16:03