ParallelKMeans.jl

Parallel & lightning fast implementation of available classic and contemporary variants of the KMeans clustering algorithm

Results 16 ParallelKMeans.jl issues

As far as I can tell, only `kmeans++` is currently implemented. Judging by "Improving Scalable K-Means++" (https://www.mdpi.com/1999-4893/14/1/6), `SRPK-means‖` could be a good method to have available :slightly_smiling_face:.
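
For context, here is a minimal sketch of plain k-means++ seeding (D² sampling), the baseline that the issue says is already available. It is not the package's implementation and it is not `SRPK-means‖`; the function name `kmeanspp_seeds` and the column-per-point layout are assumptions made for illustration.

```julia
using Random

# Hypothetical helper, not the ParallelKMeans.jl API.
# X is a d×n matrix with one point per column; returns k column indices.
function kmeanspp_seeds(rng::AbstractRNG, X::AbstractMatrix{<:Real}, k::Integer)
    n = size(X, 2)
    1 <= k <= n || throw(ArgumentError("k must be between 1 and the number of points"))
    seeds = Vector{Int}(undef, k)
    seeds[1] = rand(rng, 1:n)  # first center: uniform at random
    # squared Euclidean distance from each point to its nearest chosen center
    mindist = [sum(abs2, view(X, :, j) .- view(X, :, seeds[1])) for j in 1:n]
    for i in 2:k
        total = sum(mindist)
        if total == 0
            # every point coincides with a chosen center: fall back to uniform
            seeds[i] = rand(rng, 1:n)
        else
            # draw an index with probability proportional to mindist
            r = rand(rng) * total
            acc = zero(total)
            idx = findlast(>(0), mindist)  # guard against rounding at the tail
            for j in 1:n
                acc += mindist[j]
                if acc > r
                    idx = j
                    break
                end
            end
            seeds[i] = idx
        end
        # update nearest-center distances with the new center
        for j in 1:n
            d = sum(abs2, view(X, :, j) .- view(X, :, seeds[i]))
            mindist[j] = min(mindist[j], d)
        end
    end
    return seeds
end
```

Points that already coincide with a chosen center have zero weight, so they are never drawn again while other points remain.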

I have a working implementation of the lightweight coresets paper (https://las.inf.ethz.ch/files/bachem18scalable.pdf) in Julia. It's not distributed yet (I only have one machine to run it on anyway) but if you...
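
To make the idea concrete, here is a rough single-machine sketch of lightweight-coreset sampling as described in the linked paper: each point is drawn with probability that mixes a uniform term with its squared distance to the data mean, and drawn points are weighted by the inverse of that probability. This is not the issue author's code; `lightweight_coreset` and its arguments are hypothetical names.

```julia
using Random, Statistics

# Hypothetical helper, not part of ParallelKMeans.jl.
# X is a d×n matrix with one point per column; m is the coreset size.
# Returns the sampled column indices (with replacement) and their weights.
function lightweight_coreset(rng::AbstractRNG, X::AbstractMatrix{<:Real}, m::Integer)
    n = size(X, 2)
    μ = vec(mean(X, dims=2))
    # squared distance of every point to the data mean
    d2 = [sum(abs2, view(X, :, j) .- μ) for j in 1:n]
    total = sum(d2)
    # q(x) = 1/2 * 1/n + 1/2 * d(x, μ)^2 / Σ d(x', μ)^2
    # (uniform when all points coincide with the mean)
    q = total > 0 ? (0.5 / n) .+ (0.5 .* d2 ./ total) : fill(1 / n, n)
    cdf = cumsum(q)
    # m independent draws from q via inverse-CDF lookup
    idx = [min(searchsortedfirst(cdf, rand(rng) * cdf[end]), n) for _ in 1:m]
    w = 1 ./ (m .* q[idx])
    return idx, w
end
```

The weighted subset `X[:, idx]` with weights `w` can then be handed to any weighted k-means routine.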

Currently, `YinYang` works only with the Euclidean metric, since its main internal functions rely heavily on the exact form of the metric calculation. The algorithm should be generalized (everywhere you see `sqrt`...

We do a lot of manual unpacking; it would be nice to switch to the UnPack.jl library, after thorough benchmarking, of course.
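
As an illustration of what the switch would look like, the sketch below contrasts manual field unpacking with the `@unpack` macro from UnPack.jl. The `Workspace` struct and both functions are made up for this example and do not correspond to actual types in the package.

```julia
using UnPack

# Hypothetical container, for illustration only.
struct Workspace
    centroids::Matrix{Float64}
    labels::Vector{Int}
    totalcost::Float64
end

# Manual unpacking: one assignment per field.
function summary_manual(ws::Workspace)
    centroids = ws.centroids
    labels = ws.labels
    return size(centroids, 2), length(labels)
end

# Same thing with UnPack.jl: one line binds both locals.
function summary_unpack(ws::Workspace)
    @unpack centroids, labels = ws
    return size(centroids, 2), length(labels)
end
```

Both functions return the same result, which keeps a before/after benchmark straightforward.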

Currently we implement only the `SqEuclidean` metric, but we can add support for all other metrics in `Distances` in the same manner as it is done in https://github.com/JuliaStats/Distances.jl/blob/master/src/generic.jl#L45. We should...

enhancement
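
A small sketch of what metric-generic code could look like with Distances.jl: the metric is passed in as an argument and evaluated through `evaluate`, so the same assignment step works for `SqEuclidean`, `Cityblock`, and others. The function `nearest_center` is a hypothetical name, not an existing function in the package.

```julia
using Distances

# Hypothetical helper: index of, and distance to, the nearest center
# under an arbitrary Distances.jl metric. Centers are stored one per column.
function nearest_center(metric::PreMetric, x::AbstractVector, centers::AbstractMatrix)
    best, best_d = 0, Inf
    for j in axes(centers, 2)
        d = evaluate(metric, x, view(centers, :, j))
        if d < best_d
            best, best_d = j, d
        end
    end
    return best, best_d
end

centers = [0.0 3.0; 0.0 4.0]
nearest_center(SqEuclidean(), [3.0, 3.0], centers)
nearest_center(Cityblock(), [0.0, 1.0], centers)
```

Only the assignment step is shown; the centroid update also depends on the metric (the arithmetic mean minimizes the squared Euclidean cost, not an arbitrary one), so that part would need separate treatment.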

It would be great to provide an interface for researchers to cite this project. [Zenodo](https://zenodo.org/) seems like a good choice but other alternatives should be explored as well.

enhancement

Some refactoring is needed, and we are more or less ready for these changes. I have put them all together here, but they can be split into separate issues later. -...

enhancement

As a future step after the implementation of point-wise parallel computations, it would make sense to improve the algorithm by using "Fast kmeans" techniques. Several approaches exist; here is some inspirational...

enhancement