Leland McInnes
Both directions are stochastic. Fixing a random seed merely makes the *same* stochastic errors occur on every run. Exactly how big the error will be will depend...
This can be done in theory; in practice I am still working on the code to do this, so it isn't available in the repository yet. This may not be...
While Gower distance is quite useful, it is also somewhat heuristic. I would recommend exploring it as *one of the options* for handling mixed continuous and categorical data.
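To make the "somewhat heuristic" part concrete, here is a minimal sketch of a Gower-style distance between two records. It assumes the common formulation: range-normalised absolute difference for numeric features, simple mismatch for categorical ones, and an unweighted average over features. The function name, the records, and the ranges are all hypothetical and only for illustration; this is not code from UMAP.

```python
def gower_distance(a, b, is_categorical, ranges):
    """Gower-style distance between two records of equal length.

    is_categorical[i] marks feature i as categorical; ranges[i] is the
    observed (max - min) of numeric feature i and is ignored for
    categorical features.
    """
    total = 0.0
    for x, y, cat, value_range in zip(a, b, is_categorical, ranges):
        if cat:
            # Categorical: 0 if the labels match, 1 otherwise.
            total += 0.0 if x == y else 1.0
        else:
            # Numeric: absolute difference scaled into [0, 1] by the range.
            total += abs(x - y) / value_range if value_range > 0 else 0.0
    # Every feature gets equal weight; this is one of the heuristic choices.
    return total / len(a)


# Hypothetical records: (age, income, colour)
a = (30, 50_000.0, "red")
b = (40, 70_000.0, "blue")
is_categorical = (False, False, True)
ranges = (50.0, 100_000.0, None)  # assumed column ranges for the numeric features

d = gower_distance(a, b, is_categorical, ranges)
print(d)
```

The equal weighting and the choice of range normalisation are exactly the sort of decisions that can change your results, which is why it is worth comparing against other options.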
One approach is ``pd.get_dummies``, but you may also want to look at the [dirty-cat](https://github.com/dirty-cat/dirty_cat) library for richer options.
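As a rough illustration of the ``pd.get_dummies`` route, the sketch below one-hot encodes a single categorical column and leaves the numeric column alone. The data frame and its column names are made up for the example.

```python
import pandas as pd

# Hypothetical mixed data: one numeric column, one categorical column.
df = pd.DataFrame(
    {
        "size": [1.0, 2.5, 3.0],
        "color": ["red", "blue", "red"],
    }
)

# One-hot encode only the categorical column; "size" passes through unchanged.
encoded = pd.get_dummies(df, columns=["color"])
print(encoded)
```

Each category becomes its own indicator column (here ``color_blue`` and ``color_red``), so the result is purely numeric and can be passed on to whatever comes next.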
Hmm, this is a new one. I'll have to look into this but I can't promise any swift resolution.
A number of pickling issues were resolved in pynndescent v0.5 and umap v0.5 now depends on that -- are you seeing the issue with the newer versions? If so it...
Is it specifically complaining about the ``collections.FlatTree``? If so I'll look and see if there is some reason that is still potentially causing issues.
Thanks @simon-slowik. It is particularly interesting that pinning to 0.5 works. I'll see if I can get this fixed.
The ``graph_`` attribute of a fitted model holds that information as a SciPy sparse matrix: the weighted adjacency matrix of the graph.
What you really need is a fast approximate nearest neighbor computation. You can potentially use roaringbitmaps for that, and then UMAP does support being given the nearest neighbors (as ``knn_indices``...