Improve UMAP handling of sparsity
https://github.com/lmcinnes/umap/issues/115
Any reflections on the "sparsity regime" (term from the linked thread) that you'd like to share or document for others thinking about this? It may be worth itemizing the ways or reasons data can be missing, as you understand them, for future readers.
Suggested label: math
@colinmegill should this issue be in polisServer repo too?
I think this is the only issue discussing umap, so storing this here for future folks:
YouTube: Paper Review Call 019 - UMAP (Aug 2019)
Very good review, thanks @patcon! To anyone finding this: read the paper first if you can, then watch the review, to get more out of it.
Highlights:
- Good discussion between 1st and 2nd parts of the review: https://youtu.be/G9s3cE8TNZo?t=3074
- Standalone intro to manifolds: https://youtu.be/G9s3cE8TNZo?t=5528
Link to one more UMAP issue: https://github.com/pol-is/polis/issues/591
In a nutshell, for posterity: the participant-votes data is sparse because not all participants vote on all statements (e.g. they lose interest, or there are simply too many statements, as in hundreds). Positive/negative/pass votes are encoded as +1/-1/0 respectively, so these 'missing votes' cannot also be 0. They are left blank, hence the sparsity. (Disclaimer: in practice they can be treated as 0, which equates them with pass votes, but that's another discussion.)
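For anyone who wants to see the distinction concretely, here is a minimal sketch of the encoding described above. The toy matrix and all variable names are made up for illustration and are not taken from the Polis codebase; `None` stands in for a blank (missing) vote.

```python
# Hypothetical toy data: rows are participants, columns are statements.
# +1 = agree, -1 = disagree, 0 = pass, None = no vote recorded (blank).
votes = [
    [1, -1, None, None],
    [0, None, 1, None],
    [None, -1, None, 0],
]

# Flatten the matrix to count cell types.
cells = [v for row in votes for v in row]

# Fraction of cells with no vote at all: this is the sparsity.
missing_fraction = sum(v is None for v in cells) / len(cells)

# Explicit pass votes only (None == 0 is False, so blanks are not counted).
explicit_passes = sum(v == 0 for v in cells)

# The shortcut from the disclaimer: treat missing votes as passes.
filled = [[0 if v is None else v for v in row] for row in votes]
zeros_after_fill = sum(v == 0 for row in filled for v in row)

print(missing_fraction, explicit_passes, zeros_after_fill)
```

In this toy case half of the cells are blank, and filling them with 0 turns 2 explicit passes into 8 zeros, so a downstream method can no longer tell "passed" from "never voted".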
See also: https://compdemocracy.org/sparse-matrix/
(Posted in reply to Giorgos Georgiadis's comment above, dated Sun, Jan 3, 2021.)