hierarchical-clustering-java
cluster names are not reset for each independent clustering
I don't think globalIndex should be static in ClusterPair. Because it is static, it keeps getting incremented across every clustering you run within a single JVM instance. This leads to several problems:
- You will eventually get very large numbers in the cluster names if you run enough clusterings. It's unlikely to overflow, though, since it's a long.
- Unit tests are very hard to write because the results of one test depend on the tests that ran before it. This should not be the case.
- Running a clustering on the same data multiple times gives different results.
Instead, the cluster index should be scoped to a single clustering rather than shared across all clusterings. I will try to make a PR.
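To illustrate the difference, here is a minimal, self-contained sketch. It is not the library's actual code: the class names `StaticNamer` and `PerRunNamer`, the `cluster#` name prefix, and the `runWith...` helpers are all hypothetical, and only the static `globalIndex` field mirrors what is described above. It just shows how a static counter leaks state between runs, and how a counter owned by each clustering run would avoid that.

```java
import java.util.ArrayList;
import java.util.List;

public class ClusterNamingSketch {

    // Hypothetical stand-in for the current behaviour: the counter is static,
    // so it is shared by every clustering performed in the same JVM.
    static class StaticNamer {
        private static long globalIndex = 0;

        static String nextName() {
            return "cluster#" + (++globalIndex); // assumed name format
        }
    }

    // Hypothetical fix: each clustering run creates its own namer,
    // so the index starts again from 1 for every run.
    static class PerRunNamer {
        private long index = 0;

        String nextName() {
            return "cluster#" + (++index); // assumed name format
        }
    }

    // Simulates one clustering run that creates `merges` merged clusters,
    // naming them with the shared static counter.
    static List<String> runWithStatic(int merges) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < merges; i++) {
            names.add(StaticNamer.nextName());
        }
        return names;
    }

    // Same simulation, but the counter lives only for this run.
    static List<String> runWithPerRun(int merges) {
        PerRunNamer namer = new PerRunNamer();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < merges; i++) {
            names.add(namer.nextName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Two runs with the static counter produce different names.
        System.out.println(runWithStatic(2));
        System.out.println(runWithStatic(2));
        // Two runs with a per-run counter produce identical names.
        System.out.println(runWithPerRun(2));
        System.out.println(runWithPerRun(2));
    }
}
```

With the per-run counter, a unit test can assert on exact cluster names without caring which tests ran earlier. In the real code the counter would probably need to be passed from the clustering algorithm into wherever the names are generated, but that depends on how the PR ends up being structured.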