Import Repness code from notebooks
This ports the computation of representativeness ("repness") from the Jupyter notebook available at https://github.com/compdemocracy/analysis/blob/master/notebooks/jupyter/american-assembly-representative-groups-and-comments.ipynb
At @colinmegill's request, I am not porting the Clojure code, which was taking too long; instead I import the Python code from that notebook.
I have:
- replaced the CSV loading with queries against the database
- replaced the notebook's PCA + k-means clustering with the group info loaded from the math JSON blob in the database
- removed the limit of 5 repness comments per group and the "agree only" filtering (which differs from the Clojure code, see below), so that we get a repness score for every comment, every group, and every vote, which is the goal
- replaced one function (keeping only participants with a minimum number of votes) with a faster implementation, shaving 7 seconds off our largest conversation (see the sketch after this list)
- wrapped it in a web service that the TypeScript code can call
- added a command-line client, useful for debugging
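For the record, the faster participant filter is along these lines (a minimal sketch: the function name, the wide participants × comments layout, and the threshold are my assumptions for illustration, not the actual code of this PR):

```python
import pandas as pd

def keep_active_participants(votes: pd.DataFrame, min_votes: int = 7) -> pd.DataFrame:
    """Drop participants who cast fewer than `min_votes` votes.

    `votes` is assumed to be a participants x comments matrix,
    with NaN where a participant did not vote on a comment.
    """
    # Vectorized count of non-null votes per row, instead of
    # looping over participants one by one.
    return votes[votes.notna().sum(axis=1) >= min_votes]
```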
What is still missing:
- it does not yet use a proper pyproject.toml.
- it is not yet wrapped in a Dockerfile and docker-compose. @colinmegill: let me know if you want me to do that, or if you prefer that @ballPointPenguin or @tevko do it, since they see the full architecture, in particular the CDK rewrite.
- there is a bug on one of our largest ongoing conversations (M2).
Disclaimer:
- I have not checked the math, nor compared its output to the Clojure code, as I was asked for a two-day job of wrapping this notebook.
- There is no guarantee that the notebook implements exactly the same logic as the existing Polis math server. I suspect it does not; I remember @colinmegill telling me a few years ago that it doesn't.
- I have observed that the selection algorithm in the notebook is noticeably simpler than the one in the Clojure code.
- In the one case where I compared them, the groups computed by the notebook are very different both from those extracted from the database and from those stored in the CSV file.
Thanks @michielbakker! Agreed. I've added a simple test for the shapes with some specific values, and will add the actual check that the p-values match up later today.
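The shape test looks roughly like this (a minimal sketch; `compute_repness` and its signature are placeholders for the actual entry point, not the real API):

```python
import numpy as np
from repness import compute_repness  # hypothetical import path

def test_repness_shape():
    # Tiny vote matrix: 4 participants x 3 comments, with values in
    # {-1, 0, 1} for disagree/pass/agree and NaN for "did not vote".
    votes = np.array([
        [ 1.0,  1.0, -1.0],
        [ 1.0,  0.0, np.nan],
        [-1.0,  1.0,  1.0],
        [-1.0, -1.0,  1.0],
    ])
    groups = np.array([0, 0, 1, 1])  # two groups of two participants

    repness = compute_repness(votes, groups)

    # Expect one score per (group, comment) pair.
    assert repness.shape == (2, 3)
```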
Yeaaah, so, I'm adding the p-values check, and I'm getting some "p-values" > 1 😆 That's because, in the notebook code, the actual p-values get multiplied by $R_v(g,c)$ before being returned, whereas the paper by Small et al. (2022) states in Section 1.4, page 8 (emphasis added):
> The selection criterion for which comments are to be shown involves looking at the two-property test (in essence, the Fisher exact test; Fisher, 1922). The corresponding **Fisher Z-statistic** is multiplied by $R_v(g, c)$ to reflect both the estimated effect size and the statistical confidence associated with the effect.
The z-statistic is not the p-value: it is the computed test statistic, the value we plug into the cumulative distribution function to obtain the p-value (see the sketch below). I'm not sure that multiplying the Z-statistic by $R_v(g,c)$ is correct either, to be honest, given @colinmegill's offline warning that there could be discrepancies between the paper and the Clojure code.
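To make the distinction concrete, here is a toy sketch (assumed numbers; I use a two-proportion z formulation as a stand-in for the exact test the paper cites, and a simple agree-rate ratio as $R_v(g,c)$; none of this is the notebook's actual code):

```python
import numpy as np
from scipy import stats

# Toy counts: inside group g, 40 of 50 participants agree with
# comment c; outside g, 30 of 100 agree.
agrees_in, n_in = 40, 50
agrees_out, n_out = 30, 100

# Two-proportion z-statistic under the pooled null.
p_in, p_out = agrees_in / n_in, agrees_out / n_out
p_pool = (agrees_in + agrees_out) / (n_in + n_out)
se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_in + 1 / n_out))
z = (p_in - p_out) / se

# The p-value comes from the CDF of the test statistic (one-sided here).
p_value = 1 - stats.norm.cdf(z)

# A simple representativeness ratio standing in for R_v(g, c).
R = p_in / p_out

print(z * R)        # what the paper describes: Z-statistic scaled by R
print(p_value * R)  # what the notebook seems to do: not a p-value, can exceed 1
```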
@colinmegill, what do you prefer:
- Option A: I dive back into the Clojure code to compare the behaviour?
- Option B: we merge this as-is and consider that we have a metric of sorts?
- Option C: something else?
As you know, my first instinct had been to compare the output of the notebook computation (or that of my own Python port in #1893) to the output of the Clojure code, but that takes time, and you pointed out quite clearly that it was taking me too long. Please let me know what you want me to do at this stage.
For reference, we also have another bug in this code from the notebook: it crashes on the BG2050 conversation, possibly due to how it handles moderated comments.
Bias towards shipping! Otherwise we risk making bug fixes in the Clojure code a gate on new features. I would rather discover through iteration, if we're confident the metric is usable and pointing in the right direction!
As I mentioned, I do not vouch for the math or the implementation. But it's a number: it computes something, and that is technically shippable if we put a Docker container around it.
If you want the formula in the paper to be properly understood, and a proper guarantee that the code here implements it, that takes more time. And that assumes we take the paper as the "ground truth" for what we should be doing. The only reference implementation of that paper that we have, i.e. the one that has all the details in code if not in math, is the Clojure code.
Looks great @jucor, happy to dockerize and integrate into the AWS arch when ready
OK, thanks everyone! Given the feedback here, @tevko @colinmegill, this is ready for dockerizing and merging whenever you see fit. ~~Meanwhile I'll prepare another PR to check the p-values against the article (in particular whether the odds-ratio should be applied before or after the significance test), and another to make the moderation filter work with BG2050.~~ [edit: other developments came up; I will not be working on those follow-ups.]
Closing as obsolete.