r-docker-tutorial
section 06 minor comments
I really like the capstone in 06-sharing-all-your-analysis - it paints a picture of Docker as a tool to freeze your code, data and deps in a little blob of amber, which I think could be a really compelling way to wrap up (and we might want to call that out more). A couple of tweaks:
- I'd love to explicitly include some data as well, rather than the artificial teaching case of getting it from gapminder. Related:
- I'd also like to install something other than gapminder, to create a slightly more authentic experience.
- One pitfall that we don't deal with anywhere in this encased-in-amber approach to reproducibility is versioning, i.e., what if I do exactly what this section says, then change something and push it to my Docker Hub repo, and later you want to run my original version? There's got to be a way to roll back to earlier tags, and it can't be that tough (right? famous last words).
- Challenge problems: does anyone have a super cool thing following this model that we can get learners to download and run / play with? Or something else showing off the stack in action.
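On the versioning pitfall above, here's a rough sketch of what tag-based rollback could look like, assuming the image gets pushed with an explicit version tag rather than only `latest`. The image name, Docker Hub username and tag are placeholders I made up, not anything from the lesson. The `run` helper only prints each command unless `RUN=1` is set, so it's safe to paste and read through first.

```shell
#!/bin/sh
# Placeholder names (assumptions, not from the lesson): swap in your own.
LOCAL_IMAGE="my-analysis"
REMOTE_IMAGE="your_dockerhub_username/my-analysis"
VERSION="v1.0"

# Print each command; only execute it when RUN=1 is set in the environment.
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Give the locally built image an explicit version tag and push that tag.
run docker tag "$LOCAL_IMAGE" "$REMOTE_IMAGE:$VERSION"
run docker push "$REMOTE_IMAGE:$VERSION"

# Later, after newer versions have been pushed, someone else can still
# ask for the original by naming its tag.
run docker pull "$REMOTE_IMAGE:$VERSION"
```

This only works if nobody re-pushes over the same tag, so the lesson would probably want to say "pick a new tag for every change you share".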
There might be room for some context here, too. I think the encased-in-amber approach makes a lot of sense for reproducing that paper you wrote that time, but there's a whole other paradigm too - I use Docker as a tool to quickly stand up analysis frameworks that I might want to run on new data and new code, rather than on a fixed thing from history. As such, rather than baking everything in, it's interesting to think about infrastructure to integrate your Docker container with your GitHub repo, your figshare data, etc. Probably (way) too much for this lesson, but maybe in the supplementals or as a high-level, qualitative comment.
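To make the "don't bake everything in" paradigm a bit more concrete, one simple version is to bind-mount a working directory (say, a cloned repo) into the container at run time instead of copying it into the image. A minimal sketch, assuming a rocker RStudio image and a mount point under `/home/rstudio`. The directory, password and image choice are placeholders. As before, `run` just prints the command unless `RUN=1` is set.

```shell
#!/bin/sh
# Placeholder values (assumptions): point PROJECT_DIR at your own checkout.
PROJECT_DIR="${PROJECT_DIR:-$PWD}"
IMAGE="rocker/verse"

# Print each command; only execute it when RUN=1 is set in the environment.
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Mount the host directory into the container so new code and data are
# visible inside RStudio without rebuilding the image.
run docker run --rm -p 8787:8787 -e PASSWORD=changeme \
  -v "$PROJECT_DIR:/home/rstudio/project" "$IMAGE"
```

The trade-off is that the image alone no longer reproduces the analysis; you also need the right commit of the repo and the right copy of the data.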
Hey Bill, do you have any suggestion for the type of data that you would like to see?