hdbscan
A high performance implementation of HDBSCAN clustering.
The in-built condensed_tree_.plot seems to be useful only with very simple clustering results because huge and complex trees are virtually illegible, and further, if using cluster_selection_epsilon, the condensed tree seems...
I got these two test failures during a test run, but they passed when I ran the tests a second time: ``` ________________________ test_hdbscan_boruvka_balltree _________________________ def test_hdbscan_boruvka_balltree(): labels, p, persist,...
I am applying HDBSCAN to batches of data. For each batch, I concatenate the inputs into a list and fit HDBSCAN on them. This naturally would be slower as...
Hi, I am attempting to execute a stream that uses the **HDBScan** clustering algorithm on a set of input data to generate a model. When I am selecting the Algorithm...
Hi, you have a really great repository! After trying it successfully, I need to apply hdbscan as part of a C++ library. Do you have an implementation in C++...
I am struggling to install hdbscan into a virtual environment. I am trying to do this as part of a BERTopic install, which is also failing when trying to...
When I try to set the `min_cluster_size` to 1, the hdbscan lib throws this exception: `ValueError: Min cluster size must be greater than one`. I do not understand why this...
I am curious if GLOSH implementation in this repository correctly follows the paper's definition of "outlierness". According to [HDBSCAN* paper](https://dl.acm.org/doi/10.1145/2733381) (R. J. G. B. Campello et al. 2015, page 25):...
Hi, first of all, thanks for the library; I am really enjoying it. Sorry for the long thread, but I think this could also be informative for others on a...
Hi, I am using HDBSCAN to cluster text embeddings. As the data is unbalanced in favor of one category of embeddings, I am obtaining too many sub-clusters of that category,...