Improve UMAP handling of sparsity

Open colinmegill opened this issue 6 years ago • 6 comments

https://github.com/lmcinnes/umap/issues/115

colinmegill avatar Nov 12 '19 06:11 colinmegill

Any reflections on the "sparsity regime" (term from the linked thread) that you'd like to share or document for others thinking about this? It may be worth itemizing the ways or reasons in which you understand data can be missing, for future readers.

Suggested label: math

patcon avatar Apr 23 '20 18:04 patcon

@colinmegill should this issue be in the polisServer repo too?

patcon avatar May 10 '20 02:05 patcon

I think this is the only issue discussing umap, so storing this here for future folks:

YouTube: Paper Review Call 019 - UMAP (Aug 2019)

patcon avatar Dec 29 '20 05:12 patcon

Very good review, thanks @patcon! To anyone finding this: read the paper first if you can, then watch the review; you will get more out of it that way.

Highlights:

  • Good discussion between 1st and 2nd parts of the review: https://youtu.be/G9s3cE8TNZo?t=3074
  • Standalone intro to manifolds: https://youtu.be/G9s3cE8TNZo?t=5528

Link to one more UMAP issue: https://github.com/pol-is/polis/issues/591

ThenWho avatar Jan 03 '21 19:01 ThenWho

Any reflections on the "sparsity regime" (term from the linked thread) that you'd like to share or document for others thinking about this? It may be worth itemizing the ways or reasons in which you understand data can be missing, for future readers.

In a nutshell, for posterity: the participant-votes data is sparse because not all participants vote on all statements (e.g. they lose interest, or there are simply too many statements, as in hundreds). Positive/negative/pass votes are encoded as +1/-1/0 respectively, so these missing votes cannot also be encoded as 0. They are left blank, hence the sparsity. (Disclaimer: in practice they can be treated as 0, which equates them with pass votes, but that's another discussion.)
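To make the encoding concrete, here is a minimal sketch in Python with NumPy. The matrix values and variable names are invented for illustration and are not taken from the Polis codebase; it only shows how blank cells (represented here as NaN, which is an assumption about storage) differ from explicit pass votes, and what changes if the blanks are filled with 0.

```python
import numpy as np

# Hypothetical participant x statement vote matrix (values invented for illustration).
# +1 = positive, -1 = negative, 0 = pass, NaN = no vote recorded (left blank).
votes = np.array([
    [1.0, -1.0, np.nan, 0.0, np.nan],
    [np.nan, 1.0, 1.0, np.nan, np.nan],
    [-1.0, np.nan, np.nan, np.nan, 1.0],
    [0.0, 1.0, -1.0, np.nan, np.nan],
])

# Fraction of cells with no vote at all: this is the sparsity of the matrix.
missing = np.isnan(votes)
sparsity = missing.mean()

# Explicit pass votes. NaN never compares equal to 0, so blanks are not counted.
pass_count = int((votes == 0).sum())

# The "in practice" shortcut from the comment above: treat blanks as 0,
# which makes them indistinguishable from pass votes.
filled = np.nan_to_num(votes, nan=0.0)
zeros_after_fill = int((filled == 0).sum())

print(f"sparsity: {sparsity:.2f}")
print(f"explicit passes: {pass_count}, zeros after filling blanks: {zeros_after_fill}")
```

After filling, the two explicit passes can no longer be told apart from the ten blanks, which is the information loss the disclaimer alludes to.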

ThenWho avatar Jan 03 '21 19:01 ThenWho

See also: https://compdemocracy.org/sparse-matrix/



colinmegill avatar Jan 03 '21 23:01 colinmegill