Leland McInnes
Ah yes, I see. The 1D data is potentially going to be an issue. I'll add a test for 1D data and see if I can wring out the last...
With regard to the changes you are proposing here: they seem to make sense, but I admit that (without line references) I don't entirely follow exactly what you are proposing. I...
I'll try to take a look when I get some time. I have also had a chat with Matt Rocklin about a distributed pynndescent on dask, and he had some...
The parallelism has since been reworked and runs via numba now. With that said, a dask distributed version of pynndescent would be an awesome thing to have -- allowing users...
Hi @jamestwebber, and thanks for being willing to look into this. The threaded rp-trees remain around because it was a template of a plan for how to handle rp-trees if...
You can reach me at leland DOT mcinnes AT gmail DOT com
I haven't looked in depth, but it seems like it is a nice wrapper that provides sklearn-alike classes for many common sklearn tasks (so you can essentially just use them...
Thanks for this. Challenging datasets are always interesting, and I will have to explore exactly what makes nearest neighbors so hard with this data. It does seem remarkable that it...
Thanks for the references. I recall we had some concerns about hubness, but never looked into it much. I will have to look through the referenced paper to see if...
That sounds good -- if we can spot hub nodes (and max_candidates should be higher than 5 almost always) then some mitigation might be easily tractable. I look forward to...
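For anyone wanting to experiment with spotting hub nodes, here is a rough sketch of one common approach: count how often each point shows up in other points' neighbor lists (its "k-occurrence") and flag points that appear far more often than average. This is not what pynndescent does internally; the function names `k_occurrence` and `find_hubs`, the `factor` threshold, and the toy neighbor lists are all made up for illustration.

```python
from collections import Counter


def k_occurrence(neighbor_indices):
    """Count how often each point appears in other points' neighbor lists."""
    counts = Counter()
    for i, neighbors in enumerate(neighbor_indices):
        for j in neighbors:
            if j != i:  # ignore a point listing itself as a neighbor
                counts[j] += 1
    return counts


def find_hubs(neighbor_indices, factor=2.0):
    """Return points whose k-occurrence exceeds `factor` times k.

    With no self-matches the average k-occurrence is exactly k, so
    `factor * k` is a simple (hypothetical) cutoff for "much more
    popular than average".
    """
    k = len(neighbor_indices[0])
    counts = k_occurrence(neighbor_indices)
    threshold = factor * k
    return sorted(p for p, c in counts.items() if c > threshold)


# Toy 2-nearest-neighbor lists for six points; point 5 is in almost
# every other point's list, so it should be flagged as a hub.
toy_neighbors = [
    [1, 5],
    [0, 5],
    [5, 1],
    [5, 2],
    [5, 3],
    [4, 0],
]

hubs = find_hubs(toy_neighbors)
print(hubs)
```

In practice the neighbor lists would come from whatever kNN graph you already have, and the cutoff would need tuning per dataset; a skewness measure of the k-occurrence distribution is another option.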