Semi-supervised learning?

Open KnutJaegersberg opened this issue 3 years ago • 6 comments

Thanks for writing this awesome library; I only recently discovered it. Do you have plans to support semi-supervised UMAP? From my first tryouts of your library, this is the fastest (h)umap implementation that has nndescent. I would like to use it for semi-supervised learning, too.

KnutJaegersberg avatar Sep 20 '22 11:09 KnutJaegersberg

Hi! I definitely have plans for semi-supervision! There's a bunch of other stuff in the backlog that I have to organize. I won't have time to work on it in the next few weeks... Maybe at the end of October. Let me know if you want to help me with this.

wilsonjr avatar Sep 20 '22 11:09 wilsonjr

I don't think I have the programming skills. I'm a useR reticulating around :) I think there is a bug in RStudio, I had to wrap the terminal, otherwise it crashed :P

KnutJaegersberg avatar Sep 20 '22 11:09 KnutJaegersberg

That's totally fine! Keep coming with the good suggestions though :)

wilsonjr avatar Sep 20 '22 11:09 wilsonjr

Sure, thanks for creating this great software. I'm on Manjaro. I had an install issue as well. I had the dependencies, but could only install it after running this command, which I found somewhere I can't remember. I'm not sure why. Somehow my OS did not know about the Eigen lib, I guess.

sudo ln -s /usr/include/eigen3/Eigen /usr/include/Eigen

KnutJaegersberg avatar Sep 20 '22 12:09 KnutJaegersberg

In particular, I was trying semi-supervised learning with umap-learn on 200k records and a few hundred classes (clusters), but it did not finish the job in a reasonable time. I'm hoping a C++ library could do that.
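
For reference, a minimal sketch of how the semi-supervised setup described above usually looks in umap-learn (not humap, which does not support it as of this thread). umap-learn follows the scikit-learn convention of marking unlabeled points with a target of -1 and accepts the partial labels through the `y` argument of `fit_transform`. The helper names `mask_labels` and `embed_semi_supervised`, the toy labels, and the 10% labeled fraction are assumptions made only for illustration.

```python
import random

UNLABELED = -1  # umap-learn treats a target of -1 as "no label"


def mask_labels(labels, keep_fraction, seed=0):
    """Hypothetical helper: keep roughly keep_fraction of the labels,
    replace the rest with -1 so they count as unlabeled."""
    rng = random.Random(seed)
    return [y if rng.random() < keep_fraction else UNLABELED for y in labels]


def embed_semi_supervised(X, masked_labels):
    """Hypothetical helper: fit UMAP with partial labels.
    Requires numpy and umap-learn (pip install umap-learn)."""
    import numpy as np
    import umap

    reducer = umap.UMAP(n_components=2, random_state=42)
    # Passing y switches umap-learn to (semi-)supervised mode.
    return reducer.fit_transform(np.asarray(X), y=np.asarray(masked_labels))


# Toy labels: 1000 points spread over 5 classes, about 10% kept as labeled.
labels = [i % 5 for i in range(1000)]
masked = mask_labels(labels, keep_fraction=0.1)
```

Calling `embed_semi_supervised(X, masked)` on a feature matrix `X` with one row per label would then produce the 2D embedding; runtime on large inputs with many classes is the concern raised in this thread.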

KnutJaegersberg avatar Sep 20 '22 13:09 KnutJaegersberg

Cool! I recently ran an experiment with 100k. It took less than 2 minutes to embed. Let me know if you have any issues.

wilsonjr avatar Sep 20 '22 15:09 wilsonjr