metric-learn
Allow support for multi-label algorithms
Multi-label problems with many labels are a good use case for metric learning, so we could add support for them in the algorithms. For supervised algorithms, this would mean modifying the loss function a bit (we have been discussing this with @bellet for NCA's PR in scikit-learn, for instance). For weakly supervised algorithms, it would mean building tuples from multi-labeled data (there seem to be several strategies for doing so, e.g. based on how many labels two points share).
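To make the weakly supervised case concrete, here is a minimal sketch of one possible strategy for turning multi-labeled data into pairs. It is only an illustration, not an existing metric-learn API: the function name `pairs_from_multilabels`, the use of Jaccard overlap between label sets, and the `threshold` parameter are all assumptions made for this example.

```python
import numpy as np


def pairs_from_multilabels(Y, threshold=0.5):
    """Hypothetical helper: build pairs from a binary label-indicator matrix.

    Y has shape (n_samples, n_labels) with entries in {0, 1}.
    A pair is marked similar (+1) when the Jaccard overlap of the two
    label sets is >= threshold, and dissimilar (-1) otherwise.
    """
    Y = np.asarray(Y, dtype=bool)
    n_samples = Y.shape[0]
    # Enumerate every unordered pair (i, j) with i < j.
    i, j = np.triu_indices(n_samples, k=1)
    shared = (Y[i] & Y[j]).sum(axis=1)   # labels both points have
    total = (Y[i] | Y[j]).sum(axis=1)    # labels at least one point has
    # Assumption: two points with no labels at all get overlap 0.
    jaccard = np.divide(shared, total,
                        out=np.zeros(len(i), dtype=float),
                        where=total > 0)
    pair_indices = np.column_stack([i, j])
    y_pairs = np.where(jaccard >= threshold, 1, -1)
    return pair_indices, y_pairs


# Toy data: 4 points, 3 possible labels.
Y = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 0, 1],
              [1, 1, 0]])
pair_indices, y_pairs = pairs_from_multilabels(Y, threshold=0.5)
```

With features `X`, the array `X[pair_indices]` has shape `(n_pairs, 2, n_features)`, which, as far as I know, is the pairs format that recent metric-learn pairs learners take together with `y_pairs` in {-1, +1}. Enumerating all pairs is quadratic in the number of samples, so a real implementation would probably subsample.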
Could you share the link to the NCA PR in scikit-learn? Are you reusing what’s available in metric-learn?
> Could you share the link to the NCA PR in scikit-learn?
Sure, here is the link: https://github.com/scikit-learn/scikit-learn/pull/10058
> Are you reusing what's available in metric-learn?
In fact, I reused a lot of a PR about LMNN (https://github.com/scikit-learn/scikit-learn/pull/8602) for the architecture of the code, and just replaced its loss function with NCA's. That PR is quite developed with respect to error messages, parameter checks, automatic initialization, etc., so I guess we could bring some of those developments into metric-learn (that's already what we did in some PRs like #113, #105, and #99).
Btw, the PR was recently merged into scikit-learn! :tada:
Nice job! Congrats!
Thanks!
Also worth mentioning: it was @GaelVaroquaux who originally suggested investigating the multi-label setting ;-)
Looking forward to this development, if it is still planned.