Leland McInnes comments

Results 487 comments of



Transform gives very different results for the same data points if there are other rows or not

The transform is stochastic, so unfortunately there is no way to remedy this. If you really need a consistent (and presumably fast) transform I would recommend looking at the ParametricUMAP...

Transform gives very different results for the same data points if there are other rows or not

It handles things in batches, so permuting or adding rows will change things. It shouldn't be *completely* different (that may be an issue), but it will certainly change.

Transform gives very different results for the same data points if there are other rows or not

Okay, that means there is likely something different happening in the nearest neighbor search, or, simply, the point is effectively torn between a few options -- i.e. its embedding is...

Transform gives very different results for the same data points if there are other rows or not

A PR would be welcome.

Results of umap-learn and UMAP by cuML are different

UMAP as an algorithm is stochastic. As an optimization process it also does not differentiate between rotations or reflections of the same result. Running on many threads changes the stochasticity...
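One way to see why rotated or reflected results count as "the same": pairwise distances between embedded points do not change under a rotation or reflection, even though every coordinate does. The sketch below uses only the standard library and made-up 2D points; the helper names are illustrative and are not part of umap-learn or cuML.

```python
import math

def pairwise_distances(points):
    """Full matrix of Euclidean distances between all pairs of points."""
    return [[math.dist(p, q) for q in points] for p in points]

def rotate(points, angle):
    """Rotate 2D points counter-clockwise about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def reflect_x(points):
    """Mirror 2D points across the vertical axis."""
    return [(-x, y) for x, y in points]

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Hypothetical embedding, plus a rotated-and-reflected copy of it.
embedding_a = [(0.0, 0.0), (1.0, 0.5), (2.0, -1.0), (-0.5, 3.0)]
embedding_b = reflect_x(rotate(embedding_a, math.radians(70)))

# Coordinates disagree substantially...
coord_gap = max(math.dist(p, q) for p, q in zip(embedding_a, embedding_b))
# ...but the distance structure agrees up to floating-point error.
dist_gap = max_abs_diff(pairwise_distances(embedding_a),
                        pairwise_distances(embedding_b))
```

When comparing two runs or two implementations, this suggests comparing neighborhood or distance structure rather than raw coordinates. It only covers the rotation and reflection part; differences that come from the stochastic optimization itself will not cancel out this way.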

problem with import umap

It seems like numba added an __init__ file so structref now needs to be more explicit. As an immediate workaround you can change: ```python from numba.experimental import structref ``` to...

tighten test_umap_trustworthiness_random_init bound

Yes, this seems reasonable. Sorry for being slow to get to it, and thanks for the thorough analysis, it makes it a lot easier to just merge.

Evaluating dimensionality reduction?

Mostly I've been using trustworthiness over varying neighborhood sizes, which, I agree, is not tractable for large data sets. With enough compute time (and some judicious pre-compute and code tuning...
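For readers unfamiliar with the measure, here is a minimal brute-force sketch of trustworthiness evaluated over several neighborhood sizes, following the standard definition (penalize points that are among the k nearest neighbors in the embedding but not in the original space, weighted by how far down the original ranking they sit). The data and helper names are made up for illustration, and this is not the code used in the comment above. It ranks all pairs, which is exactly why the approach does not scale to large data sets.

```python
import math

def _ranked_neighbors(points, i):
    """Indices of all other points, nearest to points[i] first."""
    others = [j for j in range(len(points)) if j != i]
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))

def trustworthiness(original, embedded, k):
    """Brute-force trustworthiness of `embedded` with respect to `original`."""
    n = len(original)
    if not 0 < k < n / 2:
        raise ValueError("k must satisfy 0 < k < n / 2")
    penalty = 0
    for i in range(n):
        # Rank (1 = nearest) of every other point in the original space.
        rank = {j: r for r, j in enumerate(_ranked_neighbors(original, i), start=1)}
        # Embedded neighbors that were not original neighbors add (rank - k).
        for j in _ranked_neighbors(embedded, i)[:k]:
            penalty += max(0, rank[j] - k)
    return 1.0 - 2.0 * penalty / (n * k * (2 * n - 3 * k - 1))

# Toy data: ten points on a line in 3D.
original = [(float(i), 0.0, 0.0) for i in range(10)]
# A 2D embedding that preserves all distances, and one that shuffles positions.
faithful = [(float(i), 0.0) for i in range(10)]
scrambled = [(float((i * 3) % 10), 0.0) for i in range(10)]

# Score the scrambled embedding at several neighborhood sizes.
scores = {k: trustworthiness(original, scrambled, k) for k in (1, 2, 3, 4)}
faithful_score = trustworthiness(original, faithful, 3)
```

In practice a library routine such as scikit-learn's `sklearn.manifold.trustworthiness` is the more sensible choice; the hand-rolled version is here only to make the computation and its quadratic cost visible.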

Evaluating dimensionality reduction?

I have tried UMAP on fashion MNIST. It does not magically separate the 10 classes (at least not as tidily as it does with digits), but the classes that it...

Evaluating dimensionality reduction?

I can dig up a visualization I think. It is closer to LargeVis in appearance, but still a little different. As to what to trust, I think you have to...