
bellybutton data: great example

Open jennybc opened this issue 6 years ago • 2 comments

Manually transferring from old STAT 545 website repo (https://github.com/STAT545-UBC/STAT545-UBC.github.io/issues/47).

Copied from what I posted in https://github.com/Reproducible-Science-Curriculum/rr-organization1/issues/41

Might make a good homework? For wrangling? Or data package creation?

Came to my attention via @zross and @Pakillo on Twitter

Data on the biodiversity of belly buttons. You would get to say "belly button" a lot. And analyze innies vs outies.

http://navels.yourwildlife.org/bbb-project/results-and-data/

Basically they did lots of things right. It's a near miss. So fixing the problems is doable, would be very educational, and would have a happy ending.

You could talk about

  • renaming these files consistently
  • depositing them somewhere more discoverable and persistent
  • making data available in a non-proprietary format (it's xlsx only)
  • within the xlsx (ok this heads into other areas, i.e. tidy data and spreadsheet hygiene)
    • there's gratuitous human-targeted annotation in the header row (screenshot below)
    • data stored in wide form, which is probably a good choice, but gives opportunity to discuss reshape after import
    • metadata in a second worksheet, which definitely makes sense, but gives opportunity to practice joins
    • human-targeted notes in a third worksheet which again makes sense, but gives opportunity to talk about what this would look like as, e.g. a git repository of a README plus 2 csv files and 1 or more R scripts
  • data was collected in two waves, so there are two xlsx files; I've only looked at one, but I would bet $ that there are some interesting issues w/r/t extracting data from both spreadsheets and unifying into one dataset
[Screenshot: belly-button-header]
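
To make the reshape and join bullets concrete, here is a rough sketch using only the Python standard library. The column names (`taxon`, `sample_01`, `navel_type`) and all values are invented stand-ins, not the real belly button data, and it assumes the two worksheets have already been exported to CSV. In the course itself this would more naturally be done in R.

```python
import csv
import io

# Hypothetical stand-in for the main worksheet exported to CSV:
# one row per taxon, one column per sample (wide form).
WIDE_CSV = """taxon,sample_01,sample_02
Staphylococcus,120,45
Corynebacterium,30,0
"""

# Hypothetical stand-in for the metadata worksheet: one row per sample.
META_CSV = """sample,navel_type
sample_01,innie
sample_02,outie
"""


def wide_to_long(wide_text, id_col="taxon"):
    """Reshape: emit one record per (taxon, sample) pair."""
    rows = []
    for row in csv.DictReader(io.StringIO(wide_text)):
        for col, value in row.items():
            if col != id_col:
                rows.append({id_col: row[id_col], "sample": col, "count": int(value)})
    return rows


def left_join(records, meta_text, key="sample"):
    """Join: attach per-sample metadata to each long-form record."""
    lookup = {m[key]: m for m in csv.DictReader(io.StringIO(meta_text))}
    # Records with no matching metadata row are kept unchanged.
    return [{**rec, **lookup.get(rec[key], {})} for rec in records]


long_rows = wide_to_long(WIDE_CSV)
joined = left_join(long_rows, META_CSV)
for rec in joined:
    print(rec)
```

Keeping the reshape and the join as separate steps mirrors how they would be taught: import first, then tidy, then combine with the metadata.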

jennybc · Sep 16 '19 23:09

Of course the link above is dead but I have this: https://github.com/jennybc/bellybutton

jennybc · Sep 16 '19 23:09

Just to finish the link circle: http://robdunnlab.com/projects/belly-button-biodiversity/

And don't worry, there is a petri dish portrait series: https://microbialart.tumblr.com/

apreshill · Sep 17 '19 00:09