unconf15

Gender proportions in the R community

Open benmarwick opened this issue 10 years ago • 3 comments

This is more of a data science project than a software development issue, but rOpenSci is notable for promoting gender equity, so it seems relevant. The goal would be to identify specific areas in the R user and developer communities to promote gender diversity and counter bias. The general approach would be to use the gender package to analyse various data sources on contributions to the R community to investigate patterns in proportions of female to male contributors. This is currently a topical issue, as we can see from a summary of recent conversations:

  • there was a lively r-help thread on this last year,
  • a related bit of activity on Twitter about gender ratios on r-help
  • there was a "Conversation about R’s Gender Gap: The useR! her Panel" at the 2014 useR! meeting
  • @jennybc, in conversation with @hadley, @earino and @ledell, noted on twitter that there is only one female name in the 90 contributors listed on the R Project Contributors page
  • more recently @dicook has done some work with her students on this, and I've made a sketch
  • probably others I don't know about, e.g. Hadley says 'we're aware of the problem and there are plans to fix it'

Here are a few specific questions that could be considered:

  • It's a safe bet that males are a higher proportion overall, but is this changing over time? For example, what are the proportions among package maintainers (9-15% female by some estimates, and a slight but non-significant decline in recent years)? Those data are nicely structured and easy to access, for example at http://www.rdocumentation.org/
  • Is the proportion of female R developers high enough to make a case that R core should include at least one female? Who do the data suggest this should be?
  • Are there some areas of R development where females have greater representation? For example are there more female package maintainers on github or bioconductor than CRAN?
  • Is there a difference in the way bug reports posted to the R-base bugtracker by males and females are handled? Do they equally lead to fixes?
  • Is there a difference in name order for males and females amongst packages with multiple maintainers?
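
As a rough sketch of the first question (is the proportion changing over time?), the core computation could look something like the Python below. Everything in it is an assumption for illustration: the `NAME_GENDER` lookup is a hypothetical stand-in for a proper name-based classifier (in R, the gender package would play that role), the `records` are made up rather than real package metadata, and names the lookup cannot classify are simply dropped from the denominator, which is itself a choice worth debating.

```python
from collections import defaultdict

# Hypothetical name -> gender lookup; a stand-in for a real name-based
# classifier. Names missing from the table are left unclassified.
NAME_GENDER = {
    "alice": "female",
    "maria": "female",
    "robert": "male",
    "james": "male",
}


def first_name(maintainer):
    """Return the lowercased first token of a 'First Last <email>' field."""
    parts = maintainer.split()
    return parts[0].lower() if parts else ""


def female_share_by_year(records):
    """Map each year to the share of classified maintainers inferred female.

    `records` is an iterable of (year, maintainer) pairs. Maintainers whose
    first name is not in the lookup are excluded from the denominator.
    """
    counts = defaultdict(lambda: {"female": 0, "male": 0})
    for year, maintainer in records:
        gender = NAME_GENDER.get(first_name(maintainer))
        if gender is not None:
            counts[year][gender] += 1
    shares = {}
    for year, c in sorted(counts.items()):
        total = c["female"] + c["male"]
        shares[year] = c["female"] / total
    return shares


# Made-up records for illustration only; not real package data.
records = [
    (2013, "Robert Example <r@example.org>"),
    (2013, "James Example <j@example.org>"),
    (2013, "Alice Example <a@example.org>"),
    (2014, "Maria Example <m@example.org>"),
    (2014, "James Example <j@example.org>"),
    (2014, "Unknown Person <u@example.org>"),
]

shares = female_share_by_year(records)
print(shares)
```

With real data the interesting part would be the share of names that cannot be classified at all, since a large unclassified group could swamp any trend in the classified ones.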

Feb 19 '15 08:02 benmarwick

You could also take a look through mailing archives to estimate hostility quotients and the like for discussions, because that turns off new contributors (including women) and sets the tone for the whole project. Might get a bit personal, but could still be valuable; and could extend well beyond the R community in impact.

Feb 28 '15 14:02 ctb

Yes, good idea, reminds me of @treycausey's interesting analysis of r-help:
http://badhessian.org/2013/04/has-r-help-gotten-meaner-over-time-and-what-does-mancur-olson-have-to-say-about-it/

Feb 28 '15 16:02 benmarwick

I wonder if the ggplot2 mailing list might make a good comparison, because I think there are a few women that regularly answer questions there. And I think it is politer. I may be wrong.

Mar 03 '15 15:03 dicook