TuringDataStories
[WIP] EThOS PhD thesis metadata analysis
Summary
Copies the notebook from https://github.com/mhauru/EThOS-analysis/blob/master/analysis.ipynb with minimal edits to make it run. This is the starting point for the PhD thesis metadata (EThOS) story.
What should a reviewer concentrate their feedback on?
- [ ] Everything looks ok?
Acknowledging contributors
- [ ] All contributors to this pull request are already named in the table of contributors in the README file.
- [ ] The following people should be added to the table of contributors in the README file:
View / edit / reply to this conversation on ReviewNB
mhauru commented on 2022-04-14T10:47:21Z ----------------------------------------------------------------
Camila: Rerun on latest data (https://bl.iro.bl.uk/concern/datasets/bb0b3ec4-4667-436a-8e6a-d2e8e5383726?locale=en)
mhauru commented on 2022-04-14T10:47:22Z ----------------------------------------------------------------
Camila: Check how many people have a full first name. If many have it, we could get gender data based on them, and analyse that.
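A minimal sketch of the full-first-name check, assuming author names are stored as "Surname, Given names" strings (the real EThOS column name and name format are not confirmed here, and the sample names are made up). A given name counts as "full" if any token after the comma is longer than a single initial.

```python
import re


def has_full_first_name(author: str) -> bool:
    """Return True if the given-name part contains more than just initials."""
    # Assumption: names look like "Surname, Given names". Without a comma,
    # the given-name part is empty and the function returns False.
    _, _, given = author.partition(",")
    # Runs of letters (Unicode-aware), so "J." yields the token "J".
    tokens = re.findall(r"[^\W\d_]+", given)
    return any(len(token) > 1 for token in tokens)


# Hypothetical sample; in the notebook this would be the author column.
authors = ["Smith, J.", "Doe, Jane A.", "Brown, A. B.", "Nguyen, Thi Lan"]
share = sum(has_full_first_name(a) for a in authors) / len(authors)
print(f"{share:.0%} of authors have a full first name")
```

If the share turns out to be high on the real data, the full names could then be passed to whichever gender-inference approach is chosen.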
mhauru commented on 2022-04-14T10:47:23Z ----------------------------------------------------------------
Camila: Check if qualification is always PhD.
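One way to check this is to count the distinct qualification labels after light normalisation. This is only a sketch: the sample values below are invented, and the actual labels in the EThOS qualification field may look different.

```python
from collections import Counter


def qualification_counts(qualifications):
    """Count qualification labels after trimming, lowercasing and dropping dots."""
    normalised = (q.strip().lower().replace(".", "") for q in qualifications)
    return Counter(normalised)


# Hypothetical sample; in the notebook this would be the qualification column.
sample = ["Ph.D.", "PhD", "Thesis (Ph.D.)", "D.Phil.", "phd "]
counts = qualification_counts(sample)
always_phd = set(counts) == {"phd"}

print(counts.most_common())
print("Always PhD:", always_phd)
```

Listing the non-PhD labels with their counts shows whether they are spelling variants or genuinely different degrees.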
mhauru commented on 2022-04-14T10:47:23Z ----------------------------------------------------------------
Camila/Markus: This plot needs improving. Some points, ideas:
- colours are ugly
- could bin by decade
- could take a rolling average over a decade
- could colour Russell Group unis, or Turing network unis
- the plot has some strange features that look like something is happening in the data, but it's more an artefact of the plotting
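The binning and rolling-average ideas above could be sketched as follows. This assumes each thesis has an integer award year; the helper names and the sample years are hypothetical, and the notebook may well prefer the equivalent pandas operations.

```python
from collections import Counter


def count_by_decade(years):
    """Number of theses per decade, keyed by the first year of the decade."""
    return dict(sorted(Counter((y // 10) * 10 for y in years).items()))


def yearly_counts(years):
    """Counts for every year from min to max, with 0 for years without theses."""
    counts = Counter(years)
    return [(y, counts.get(y, 0)) for y in range(min(years), max(years) + 1)]


def rolling_mean(values, window):
    """Mean over each full window of consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Made-up sample of award years.
years = [1998, 1999, 1999, 2001, 2003, 2003, 2003, 2010]
per_decade = count_by_decade(years)
smoothed = rolling_mean([n for _, n in yearly_counts(years)], window=10)
```

Filling missing years with zeros before smoothing matters: otherwise the window would span a variable number of calendar years.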
mhauru commented on 2022-04-14T10:47:24Z ----------------------------------------------------------------
Maybe put in one figure? Maybe rolling average to smooth?
mhauru commented on 2022-04-14T10:47:25Z ----------------------------------------------------------------
Keep buzzwords in this story, move all network analysis stuff to its own story.
mhauru commented on 2022-04-14T10:47:26Z ----------------------------------------------------------------
Try turning this into a figure.
mhauru commented on 2022-04-14T10:47:26Z ----------------------------------------------------------------
Present more nicely, less long list, more plotty
mhauru commented on 2022-04-14T10:47:27Z ----------------------------------------------------------------
This plot is hard to interpret.
@mhauru and @crangelsmith met on the 14th of April and decided to split this notebook into two stories, one per part.
First part:
- Descriptive stats and figures.
- Add an analysis on gender if possible
- First pass at a title analysis using word counts.
Second part:
- Title analysis using networks (second part of this notebook).
The comments above help guide the curation of the stories. We will open a new PR with the first part story and keep this one open until the second story is ready for review.
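For the first-pass title analysis with word counts planned for part one, a minimal sketch might look like this. The stopword list and the sample titles are placeholders, not taken from the notebook, and the tokenisation is deliberately crude (ASCII letters only).

```python
import re
from collections import Counter

# Placeholder stopword list; a real analysis would likely use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and", "for", "to", "with"}


def title_word_counts(titles):
    """Count lowercase words across all titles, skipping stopwords."""
    counts = Counter()
    for title in titles:
        words = re.findall(r"[a-z]+", title.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts


# Made-up sample titles.
titles = [
    "A study of machine learning in medicine",
    "Machine learning for climate models",
    "The history of medicine",
]
counts = title_word_counts(titles)
print(counts.most_common(3))
```

Grouping the same counts by award year would give the buzzword-over-time view discussed in the review comments.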
Todo list:
- [x] Update data to latest versions
- [ ] Do gender analysis
- [ ] Remove everything that will be in part 2
- [ ] Make plots prettier
- [ ] Go through review comments in ReviewNB
- [ ] Clean up imports (once part 2 is gone)