rust-kmedoids
rust-kmedoids copied to clipboard
Add CLARANS and FastCLARANS
As these methods require distance functions and a data matrix, it makes sense to first implement CLARA #5
CLARANS modifies PAM with a random sampling follows:
- choose a subset of medoid + non-medoid pairs uniformly
- compute the best swap in this subset only
- perform the best swap
hence it approximates PAM.
We can easily (c.f., the FastPAM papers) integrate the FastPAM optimizations:
FastCLARANS:
- choose a uniform subset of non-medoids
- compute the loss regarding all medoids in O(N) per non-medoid using the FastPAM optimizations
It makes less sense to also integrate the "Eager" optimizations - eventually we can just run FasterPAM on shuffled input with a limit on the number of distance computations to achieve a similar approximation. Hence I do not believe CLARANS and FastCLARANS are of much practical relevance anymore.
The FastCLARANS reference implementation in Java is in ELKI: FastCLARANS but this contains data structures optimized for Java that we can do differently in Rust without the same overhead.