hdbscan

A high performance implementation of HDBSCAN clustering.

Results: 216 hdbscan issues

The built-in `condensed_tree_.plot` seems to be useful only for very simple clustering results, because huge and complex trees are virtually illegible. Further, if using `cluster_selection_epsilon`, the condensed tree seems...

I got these two test failures during a test run, but they passed when I ran the tests a second time: ``` ________________________ test_hdbscan_boruvka_balltree _________________________ def test_hdbscan_boruvka_balltree(): labels, p, persist,...

I am applying HDBSCAN to batches of data. For each batch, I concatenate the inputs into a list and fit HDBSCAN to them. This naturally would be slower as...

Hi, I am attempting to execute a stream that uses the **HDBSCAN** clustering algorithm on a set of input data to generate a model. When I select the Algorithm...

Hi, you have a really great repository! After trying it successfully, I need to use hdbscan as part of a C++ library. Do you have an implementation in C++...

I am struggling to install hdbscan into a virtual environment. I am trying to do this as part of a BERTopic install, which is also failing when trying to...

When I try to set the `min_cluster_size` to 1, the hdbscan lib throws this exception: `ValueError: Min cluster size must be greater than one`. I do not understand why this...
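A possible workaround, assuming the goal behind `min_cluster_size=1` is for every isolated point to count as its own cluster: HDBSCAN marks noise points with the label `-1`, so those points can be relabeled as singleton clusters after fitting. The sketch below is pure Python and does not call the library; `noise_to_singletons` is a hypothetical helper name, not part of hdbscan.

```python
def noise_to_singletons(labels):
    """Give every noise point (label -1) its own new cluster id.

    Hypothetical helper: `labels` is a sequence of integer cluster labels
    such as the `labels_` attribute of a fitted clusterer.
    """
    # First label not already used by a real cluster (0 if there are none).
    next_id = max(labels, default=-1) + 1
    relabeled = []
    for label in labels:
        if label == -1:
            # Noise point: promote it to a cluster of size one.
            relabeled.append(next_id)
            next_id += 1
        else:
            relabeled.append(label)
    return relabeled


labels = [0, 0, -1, 1, 1, -1]
print(noise_to_singletons(labels))  # [0, 0, 2, 1, 1, 3]
```

Whether singleton clusters are meaningful depends on the application; this only changes the labeling, not the density-based hierarchy the algorithm builds.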

I am curious whether the GLOSH implementation in this repository correctly follows the paper's definition of "outlierness". According to the [HDBSCAN* paper](https://dl.acm.org/doi/10.1145/2733381) (R. J. G. B. Campello et al. 2015, page 25):...
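For orientation, a minimal sketch of the GLOSH score in the form it is commonly stated, assuming the definition 1 − ε_max(x)/ε(x): ε(x) is the smallest radius at which x still belongs to a cluster, and ε_max(x) is the smallest radius at which that cluster or one of its subclusters still exists. The function and parameter names are illustrative and are not taken from the repository; the second function is the same quantity rewritten with λ = 1/ε.

```python
def glosh_score(eps_point, eps_max):
    """GLOSH-style outlier score, assuming the form 1 - eps_max / eps_point.

    eps_point: radius at which the point is last attached to its cluster.
    eps_max:   smallest radius at which that cluster (or a subcluster) survives.
    """
    if eps_point <= 0:
        raise ValueError("eps_point must be positive")
    return 1.0 - eps_max / eps_point


def glosh_score_from_lambda(lambda_point, lambda_max):
    """Same score expressed with density levels lambda = 1 / eps."""
    if lambda_max <= 0:
        raise ValueError("lambda_max must be positive")
    return (lambda_max - lambda_point) / lambda_max


# A point that drops out at radius 2.0 while its cluster persists down to 1.0.
print(glosh_score(2.0, 1.0))              # 0.5
print(glosh_score_from_lambda(0.5, 1.0))  # 0.5
```

Under this form, a point that leaves its cluster at the same level where the cluster's densest part disappears scores 0, and points that leave much earlier score closer to 1.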

Hi, first of all, thanks for the library; I am really enjoying it. Sorry for the long thread, but I think this could also be informative for others on a...

Hi, I am using HDBSCAN to cluster text embeddings. As the data is unbalanced in favor of one category of embeddings, I am obtaining too many sub-clusters of that category,...