Leland McInnes
Here's what UMAP did:  As I said, more similar to LargeVis. It is worth noting that UMAP has kept some of the groups together where LargeVis split them into...
I tweaked the min_dist parameter (which defines how closely the embedding should pack points together in the embedded space) to compress things less (and hence resemble the t-SNE result more)...
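For reference, a minimal sketch of what that tweak looks like in code; the dataset and the particular values are just illustrative, not the ones used for the plots above:

```python
# Larger min_dist leaves more room between points (looser, more t-SNE-like
# layout); smaller values let UMAP pack clusters together tightly.
import umap
from sklearn.datasets import load_digits

X, _ = load_digits(return_X_y=True)

tight = umap.UMAP(min_dist=0.1).fit_transform(X)  # default value, compact clusters
loose = umap.UMAP(min_dist=0.8).fit_transform(X)  # less compression, more spread
```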
I'm struggling to find time to shore up all the math and get the preprint done (because I really want sound explanations of *why* things work, which means getting good...
There's something in https://github.com/lmcinnes/umap/blob/master/umap/validation.py and you can see https://github.com/scikit-learn/scikit-learn/blob/ccd3331f7eb3468ac96222dc5350e58c58ccba20/sklearn/manifold/t_sne.py#L394 for a (semi-canonical) implementation.
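Assuming the linked sklearn code is the trustworthiness measure (how well local neighbourhoods are preserved by an embedding), a minimal sketch of using it on a UMAP result would look something like this:

```python
import umap
from sklearn.datasets import load_digits
from sklearn.manifold import trustworthiness

X, _ = load_digits(return_X_y=True)
embedding = umap.UMAP().fit_transform(X)

# Scores close to 1.0 mean local neighbourhoods are well preserved.
score = trustworthiness(X, embedding, n_neighbors=15)
print(score)
```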
It is certainly not expected behaviour. I'll have to look into this a little and see if I can reproduce it to figure out why this is going astray (it...
@vseledkin is correct, the negative sampling is a method used to avoid ever actually computing the full loss value, which is remarkably expensive. I did have a function that could...
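To give a feel for why it is expensive, here is a rough, unoptimized sketch (mine, not code from the library) of the kind of full fuzzy-set cross-entropy loss in question; the a, b values are assumed to be roughly those fit for the default min_dist, and P would come from something like a fitted model's graph_ converted to a dense array. It is O(n²) in time and memory, which is exactly what the negative-sampling optimizer avoids ever evaluating.

```python
import numpy as np
from sklearn.metrics import pairwise_distances

def full_cross_entropy(P, embedding, a=1.58, b=0.9, eps=1e-12):
    """Unoptimized sketch of the full fuzzy-set cross entropy over all pairs.

    P: dense (n, n) array of high-dimensional membership strengths.
    embedding: (n, d) array of low-dimensional coordinates.
    a, b: curve parameters (only roughly the default min_dist fit here).
    """
    d = pairwise_distances(embedding)
    Q = 1.0 / (1.0 + a * d ** (2 * b))       # low-dimensional memberships
    ce = -P * np.log(Q + eps) - (1.0 - P) * np.log(1.0 - Q + eps)
    np.fill_diagonal(ce, 0.0)                # ignore self-pairs
    return ce.sum()
```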
Convergence is not checked -- instead the optimization is run for a specified number of epochs.
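In practice that means the amount of optimization is controlled up front; a small sketch of what that looks like (values are illustrative):

```python
# The optimizer runs for a fixed n_epochs; there is no convergence test
# and no early stopping, so more epochs simply means more optimization work.
import umap
from sklearn.datasets import load_digits

X, _ = load_digits(return_X_y=True)
embedding = umap.UMAP(n_epochs=500).fit_transform(X)
```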
Thanks for the effort to track this down -- intermittent issues are the hardest to resolve. Hopefully a small consistent reproducer can be found.
The caching does not speed up the runtime performance, but it does speed up the import time performance -- there were complaints that pynndescent took too long to import, so...
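As a hedged illustration of the mechanism (not pynndescent's actual code): numba can write compiled functions to an on-disk cache, so a function that is compiled eagerly at import time only pays the compilation cost on the first import.

```python
import numpy as np
from numba import njit

# An explicit signature forces compilation at import time; cache=True stores
# the compiled result on disk so later imports skip recompilation entirely.
@njit("f8(f8[:], f8[:])", cache=True)
def sq_euclidean(x, y):
    result = 0.0
    for i in range(x.shape[0]):
        d = x[i] - y[i]
        result += d * d
    return result
```

The function itself runs at the same speed either way; only the import-time compilation cost is amortized.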
I also had no real issues with the import time myself, but I think there are a number of use cases where shaving 5 seconds off the import time is...