Leland McInnes
Here's what UMAP did:  As I said, more similar to LargeVis. It is worth noting that UMAP has kept some of the groups together where LargeVis split them into...
I tweaked the min_dist parameter (which defines how closely the embedding should pack points together in the embedded space) to compress things less (and hence resemble the t-SNE result more)...
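For reference, a minimal sketch of what that tweak looks like in code; the dataset and the particular values are just illustrative, not the ones used for the plots above:

```python
# Larger min_dist leaves more room between points (looser, more t-SNE-like
# layout); smaller values let UMAP pack clusters together tightly.
import umap
from sklearn.datasets import load_digits

X, _ = load_digits(return_X_y=True)

tight = umap.UMAP(min_dist=0.1).fit_transform(X)  # default value, compact clusters
loose = umap.UMAP(min_dist=0.8).fit_transform(X)  # less compression, more spread
```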
I'm struggling to find time to shore up all the math and get the preprint done (because I really want sound explanations of *why* things work, which means getting good...
There's something in https://github.com/lmcinnes/umap/blob/master/umap/validation.py and you can see https://github.com/scikit-learn/scikit-learn/blob/ccd3331f7eb3468ac96222dc5350e58c58ccba20/sklearn/manifold/t_sne.py#L394 for a (semi-canonical) implementation.
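Assuming the linked sklearn code is the trustworthiness measure (how well local neighbourhoods are preserved by an embedding), a minimal sketch of using it on a UMAP result would look something like this:

```python
import umap
from sklearn.datasets import load_digits
from sklearn.manifold import trustworthiness

X, _ = load_digits(return_X_y=True)
embedding = umap.UMAP().fit_transform(X)

# Scores close to 1.0 mean local neighbourhoods are well preserved.
score = trustworthiness(X, embedding, n_neighbors=15)
print(score)
```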
It is certainly not expected behaviour. I'll have to look into this a little and see if I can reproduce it to figure out why this is going astray (it...
@vseledkin is correct, the negative sampling is a method used to avoid ever actually computing the full loss value, which is remarkably expensive. I did have a function that could...
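To give a feel for why it is expensive, here is a rough, unoptimized sketch (mine, not code from the library) of the kind of full fuzzy-set cross-entropy loss in question; the a, b values are assumed to be roughly those fit for the default min_dist, and P would come from something like a fitted model's graph_ converted to a dense array. It is O(n²) in time and memory, which is exactly what the negative-sampling optimizer avoids ever evaluating.

```python
import numpy as np
from sklearn.metrics import pairwise_distances

def full_cross_entropy(P, embedding, a=1.58, b=0.9, eps=1e-12):
    """Unoptimized sketch of the full fuzzy-set cross entropy over all pairs.

    P: dense (n, n) array of high-dimensional membership strengths.
    embedding: (n, d) array of low-dimensional coordinates.
    a, b: curve parameters (only roughly the default min_dist fit here).
    """
    d = pairwise_distances(embedding)
    Q = 1.0 / (1.0 + a * d ** (2 * b))       # low-dimensional memberships
    ce = -P * np.log(Q + eps) - (1.0 - P) * np.log(1.0 - Q + eps)
    np.fill_diagonal(ce, 0.0)                # ignore self-pairs
    return ce.sum()
```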
Convergence is not checked -- instead the optimization is run for a specified number of epochs.
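In practice that means the amount of optimization is controlled up front; a small sketch of what that looks like (values are illustrative):

```python
# The optimizer runs for a fixed n_epochs; there is no convergence test
# and no early stopping, so more epochs simply means more optimization work.
import umap
from sklearn.datasets import load_digits

X, _ = load_digits(return_X_y=True)
embedding = umap.UMAP(n_epochs=500).fit_transform(X)
```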
Thanks for the effort to track this down -- intermittent issues are the hardest to resolve. Hopefully a small consistent reproducer can be found.
The caching does not speed up the runtime performance, but it does speed up the import time performance -- there were complaints that pynndescent took too long to import, so...
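As a hedged illustration of the mechanism (not pynndescent's actual code): numba can write compiled functions to an on-disk cache, so a function that is compiled eagerly at import time only pays the compilation cost on the first import.

```python
import numpy as np
from numba import njit

# An explicit signature forces compilation at import time; cache=True stores
# the compiled result on disk so later imports skip recompilation entirely.
@njit("f8(f8[:], f8[:])", cache=True)
def sq_euclidean(x, y):
    result = 0.0
    for i in range(x.shape[0]):
        d = x[i] - y[i]
        result += d * d
    return result
```

The function itself runs at the same speed either way; only the import-time compilation cost is amortized.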
I also had no real issues with the import time myself, but I think there are a number of use cases where shaving 5 seconds off the import time is...