Weights when combining multiple UMAP models

Open candalfigomoro opened this issue 4 years ago • 2 comments

In this example https://github.com/lmcinnes/umap/issues/58#issuecomment-419682509 there was a mix_weight parameter to set a specific intersection weight.

When using the new * operator (https://umap-learn.readthedocs.io/en/latest/composing_models.html), is there a way to set a specific weight?

Could something like mapper1 * mapper2 * mapper1 give a higher weight to mapper1 (since we intersect it 2 times)? What is the proper way to do this?

Thank you :)

candalfigomoro avatar Feb 24 '21 13:02 candalfigomoro

Currently there is no proper way to do it -- the interface provides a quick and easy approach, but doesn't support a mix weight (the weighting is balanced between the two). It is tricky to have an API that would be both simple to use, and yet have enough flexibility. The right answer might be to add a separate compose method that can take a bunch of parameters such as the compose operator, mix weights, etc. Perhaps in an upcoming patch release (if the implementation turns out to be not too hard); perhaps in 0.6 (if things get messy). Worst case you can fall back to the approach outlined in the cited issue -- it should still work.

lmcinnes avatar Feb 24 '21 15:02 lmcinnes

I have a similar question. I am working on a dataset with about 82k observations and 140 features, of which only a few are numerical; the remainder are one-hot-encoded variables.

I saw that umap.umap_.general_simplicial_set_intersection includes a weight parameter (https://antonsruberts.github.io/kproto-audience/). Is it then preferable to use "the old approach" for intersections rather than the * operator?

RasGre avatar Mar 27 '23 11:03 RasGre
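
For readers landing on this thread, here is a minimal sketch of the "old approach" mentioned above: intersecting the fuzzy graphs of two fitted models with an explicit weight through umap.umap_.general_simplicial_set_intersection, then calling umap.umap_.reset_local_connectivity on the result. The helper name weighted_intersection, the demo function, the random data, and the 0-to-1 range check are illustrative assumptions and not part of umap-learn. The sketch stops at the combined graph; the final embedding step is left out because its arguments differ between umap-learn versions. Check in your installed version which of the two graphs a given weight favors before relying on it.

```python
def weighted_intersection(graph_a, graph_b, weight=0.5):
    """Intersect two UMAP fuzzy graphs with an explicit mix weight.

    Hypothetical helper, not part of umap-learn.
    """
    # The range check is this sketch's own guard (an assumption),
    # not documented library behaviour.
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")

    # Imported lazily so the helper can be defined without umap-learn installed.
    from umap.umap_ import (
        general_simplicial_set_intersection,
        reset_local_connectivity,
    )

    mixed = general_simplicial_set_intersection(graph_a, graph_b, weight=weight)
    # Re-normalise so every point keeps at least one strong connection.
    return reset_local_connectivity(mixed)


def demo():
    """Fit one model per feature block on made-up data and mix their graphs."""
    import numpy as np
    import umap

    rng = np.random.default_rng(0)
    numeric = rng.normal(size=(200, 5))          # stand-in for numerical columns
    onehot = rng.integers(0, 2, size=(200, 20))  # stand-in for one-hot columns

    mapper_num = umap.UMAP(metric="euclidean", random_state=0).fit(numeric)
    mapper_cat = umap.UMAP(metric="dice", random_state=0).fit(onehot)

    # weight=0.5 is the balanced case; other values shift the balance.
    return weighted_intersection(mapper_num.graph_, mapper_cat.graph_, weight=0.3)
```

Calling demo() requires umap-learn and NumPy and returns a sparse matrix with one row and one column per observation. With weight=0.5 the call corresponds to the balanced setting described earlier in the thread.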