Leland McInnes

Results 487 comments of Leland McInnes

Thank you for the kind words. Mostly what UMAP will buy you over using DBSCAN directly on the embedding vectors is a lot more of your data clustered while still...

I think for that use case you might want to look into ParametricUMAP. UMAP does have an ``update`` method, but it is definitely not the same as training on the...

This seems to break the Boruvka KDTrees, which don't seem to support taking a p value. You may need a further workaround (use ball trees) in that case.

Soon; conda-forge monitors pypi, so a rebuild of the feedstock should get triggered.

Not currently, no, and it is unlikely to do so in the future. Eventually the dual tree Boruvka should be ported into scikit-learn, and I have a more modern (but at...

The dual tree Boruvka versions that are here (and make it much faster for large datasets) have not been included yet, so this repo still represents a useful library for...

The numpy 2.0 release might mean you have to ensure cython gets run to rebuild C sources. Potentially this can be solved with the ``--no-binary`` option in pip install, but I'm...

It looks like numpy 2.0 is not going to play nice with cython and hdbscan here. Figuring out how to make it all work will take some time. In the...

You can try the soft clustering options: https://hdbscan.readthedocs.io/en/latest/soft_clustering.html but there really isn't a magical straightforward way to do this.

HDBSCAN should not really be taking that long, especially the GPU implementation, if you are reducing to a reasonable number of dimensions. So something odd is going on there.