Implement masking to control how embedded points are updated

Open matthieuheitz opened this issue 4 years ago • 8 comments

matthieuheitz avatar Mar 15 '21 17:03 matthieuheitz

Coverage Status

Coverage decreased (-1.7%) to 87.245% when pulling c35db5f584a92fac392502e58b5973ff073f0899 on matthieuheitz:fixedpoints into f86c922af62af35db53e0627d70eca5362c3fe7d on lmcinnes:master.

coveralls avatar Mar 24 '21 22:03 coveralls

Note that the remaining test failures are due to how coveralls plays with azure, so you can safely ignore them.

Another note: I may start a new 0.6dev branch and target this at that so we can merge it in sooner and work on it more easily without messing with the main branch. Let me know if you think that would be a good idea.

lmcinnes avatar Mar 29 '21 16:03 lmcinnes

I have now tested this and found that I also needed to block this rescaling so that my points (whose coordinates lie slightly outside [0, 10]) stay fixed.
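To illustrate why rescaling moves pinned points, here is a minimal sketch. It assumes the rescaling in question is a per-axis min-max mapping onto [0, 10]; `rescale_to_0_10` is a hypothetical helper written for this example, not a umap function.

```python
import numpy as np


def rescale_to_0_10(embedding):
    """Min-max rescale each axis to [0, 10] (assumed form of the rescaling)."""
    lo = embedding.min(axis=0)
    hi = embedding.max(axis=0)
    return 10.0 * (embedding - lo) / (hi - lo)


# Initial coordinates whose first axis lies slightly outside [0, 10].
init = np.array([[-0.5, 0.0],
                 [5.0, 5.0],
                 [10.5, 10.0]])

rescaled = rescale_to_0_10(init)

# The first axis is squeezed into [0, 10], so points meant to stay fixed
# no longer sit at their original coordinates. The second axis already
# spans exactly [0, 10] and is left unchanged.
moved = ~np.isclose(rescaled, init)
print(moved)
```

Any point that is meant to stay fixed would need to be exempt from such a step, or the step skipped entirely when pinning is used.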

jondo avatar Mar 30 '21 17:03 jondo

Also, I would like to base my pinned initial embedding on the spectral embedding, and I suggest this change to get it.

Update: The suggested change was merged into master 👍

jondo avatar Mar 30 '21 18:03 jondo

Matthieu raised the rescaling issue with me elsewhere. It is a little tricky as the actual init does need to land in a reasonable spot, or the resulting embedding can go very badly. Leaving the rescaling in ensured that we had a sensible starting point. Otherwise there is the question of whether we leave it to the user -- it is not hard to accidentally provide a bad initialization that produces unexpected results and is hard to diagnose as to what is going wrong. I was hoping to avoid that if possible. Perhaps a reasonable option would be to come up with some semi-reasonable checks and warn if the provided initialization is troublesome?

lmcinnes avatar Mar 30 '21 19:03 lmcinnes

I also think that warning instead of rescaling is the way to go. Perhaps "one of the coordinate ranges is outside [8, 12] (i.e. more than 20 % off)" is a semi-reasonable condition? This assumes that the layout optimization is independent of absolute embedding location.
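A rough sketch of what such a check could look like. It reads "coordinate range" as the per-axis extent (max minus min) and uses the [8, 12] bounds proposed above as defaults; `check_init_ranges` and its parameters are hypothetical names, not part of umap.

```python
import warnings

import numpy as np


def check_init_ranges(init, low=8.0, high=12.0):
    """Warn if any axis extent (max - min) of `init` is outside [low, high].

    Returns the per-axis extents so the caller can inspect them.
    """
    init = np.asarray(init, dtype=float)
    extents = init.max(axis=0) - init.min(axis=0)
    bad_axes = np.flatnonzero((extents < low) | (extents > high))
    if bad_axes.size:
        warnings.warn(
            f"Initial embedding extent on axes {bad_axes.tolist()} is outside "
            f"[{low}, {high}]; the layout optimization may behave badly.",
            UserWarning,
            stacklevel=2,
        )
    return extents


# Usage: the second axis spans 30 units, so a warning is emitted for axis 1.
check_init_ranges(np.array([[0.0, 0.0], [10.0, 30.0]]))
```

Because only the extent is checked, the test is independent of where the embedding sits in absolute terms, which matches the stated assumption.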

jondo avatar Mar 31 '21 09:03 jondo

A colleague working on an interactive visualization tool had an interesting scenario. A user could (say) inspect and then drag certain points to the left or right and "pin" the x-axis of the dragged points while leaving the other axes of those points adaptable.

Also supporting a 2-D pin_mask of shape nsamples x dim seems to describe the capability he was looking for. In this case, we want the gradient weights other_mask and current_mask to change from scalars into per-dimension vector gradient multipliers.
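A minimal NumPy sketch of the idea, not the PR's actual implementation. It assumes a mask value of 0 means "pinned" and 1 means "free", and `apply_masked_update` is a hypothetical helper standing in for the real optimization loop.

```python
import numpy as np


def apply_masked_update(embedding, grad, pin_mask, lr=1.0):
    """Apply one gradient step, scaled by a 1-D or 2-D pin mask.

    Assumed convention: mask value 0 pins a coordinate, 1 leaves it free.
    A 1-D mask (n_samples,) pins whole points; a 2-D mask
    (n_samples, dim) pins individual axes of individual points.
    """
    mask = np.asarray(pin_mask, dtype=float)
    if mask.ndim == 1:
        mask = mask[:, None]  # broadcast one scalar weight across all axes
    return embedding + lr * mask * grad


embedding = np.zeros((3, 2))
grad = np.ones((3, 2))

# Point 0: x pinned, y free. Point 1: fully free. Point 2: fully pinned.
pin_mask_2d = np.array([[0, 1],
                        [1, 1],
                        [0, 0]])
updated_2d = apply_masked_update(embedding, grad, pin_mask_2d)

# The 1-D form can only pin or free an entire point.
pin_mask_1d = np.array([0, 1, 1])
updated_1d = apply_masked_update(embedding, grad, pin_mask_1d)
```

With the 2-D mask the scalar multiplier becomes a per-axis vector, so a dragged point can keep its x-coordinate while its y-coordinate continues to adapt.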

kruus avatar May 04 '21 20:05 kruus

Just chiming in for if/when this is re-activated: pin_map feels most clear as an end-user who doesn't have deep knowledge of umap's internals. </ bikeshed> :)

patcon avatar Jun 26 '25 16:06 patcon