Optimizing reproducible research with R and related tools

Open benmarwick opened this issue 8 years ago • 9 comments

One of the great strengths of R is how it enables reproducible research. I'm interested in the use of R packages as research compendia to accompany published articles and reports. I'd love to learn more and see some demos of how people are using R and related tools (such as Docker and make) to simplify the reproducibility of their research, and find out where the pain points are for others.

benmarwick avatar Mar 12 '16 00:03 benmarwick

I'd love to learn more about using Docker and make to simplify reproducible research.

I can't remember where I read this but someone was saying that each paper or new method should have a shiny app alongside it to demonstrate the use of the tools. What do you think about that?

njtierney avatar Mar 12 '16 01:03 njtierney

Yes, an accompanying shiny app is an interesting proposal (maybe you saw this paper?). I guess if interaction and modification of plot parameters is a priority, then shiny would be a good option. Have you seen any good real-life examples of this? I guess my attention has been focused on reproducing the plots and numbers in the published article, since that's the most useful thing to me at the moment.

All I've seen are toy examples, such as this box-plot maker (1, 2). Perhaps a more substantial and useful effort might be a shiny app that generates the plots that Weissgerber et al. recommend. These are quite hard to do in Excel, SPSS, etc. I've written R code for generating these kinds of plots.

That suggests a related challenge, a shiny app to do something very useful for non-R users, but very difficult to do well in Excel.

benmarwick avatar Mar 12 '16 05:03 benmarwick

I can't find the example, but I think it was a blog post or maybe even a tweet saying something like:

"All packages and papers should have an accompanying shiny app with them"

It's great to know that this issue has been written about in journals; that gives strong motivation for working on these projects.

From this I guess I see a few possible options for projects:

  1. Turn your R code reproducing figures from Weissgerber into a shiny app
  2. Shiny app that does tasks that are hard to do in Excel. (box plots, density plots?)

Perhaps we could also work on:

  • examples / tutorials of using Make with knitr
  • examples / tutorials of turning a paper into a shiny application.
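As a starting point for the Make-with-knitr idea, the rule make applies is a timestamp check: rebuild the output only when it is missing or older than something it depends on. Below is a minimal sketch of that rule in base R. `needs_rebuild()` and the file names are hypothetical, made up for illustration, and this is not a replacement for make itself.

```r
# Hypothetical helper (not from any package): TRUE when `target` is missing
# or is older than at least one of its dependencies, as in a make rule.
needs_rebuild <- function(target, deps) {
  if (!file.exists(target)) {
    return(TRUE)
  }
  missing_deps <- deps[!file.exists(deps)]
  if (length(missing_deps) > 0) {
    stop("missing dependency: ", paste(missing_deps, collapse = ", "))
  }
  any(file.mtime(deps) > file.mtime(target))
}

# Usage sketch: "paper.Rmd" and "paper.md" are placeholder file names.
# The guards keep this a no-op when knitr or the source file is absent.
if (requireNamespace("knitr", quietly = TRUE) &&
    file.exists("paper.Rmd") &&
    needs_rebuild("paper.md", "paper.Rmd")) {
  knitr::knit("paper.Rmd", output = "paper.md")
}
```

A real Makefile expresses the same dependency as a target line followed by a recipe. The R version is only meant to show what a tutorial would need to explain.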

njtierney avatar Mar 14 '16 03:03 njtierney

I've been using shiny apps to do analysis and produce figures for a paper we're working on and am interested in developing this idea further.

mattwatts avatar Mar 29 '16 21:03 mattwatts

I'm keen to work on this topic (well, one of several). It's rather timely since I'm working with a colleague to bring this idea forward in our field of plant pathology. There are a few of us using R and making our research reproducible, and we'd like to see more people making an effort.

I've not really touched Shiny, but can certainly see the benefits. Last night I was messing around with Docker to install a Linux instance to test R packages.

adamhsparks avatar Mar 30 '16 01:03 adamhsparks

Kaggle scripts is one example that demonstrates the potential of Docker. Kaggle has an identical Docker image that runs all their R scripts. With their Dockerfile (together with the data and scripts, perhaps shipped separately) one could reproduce any of the results.

Turns out Rocker already has an RStudio container. But I'm guessing different disciplines/use cases would require different setups/packages/tools. So perhaps one idea is to have a package that writes a Dockerfile?
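To make the Dockerfile-writing idea concrete, here is a minimal sketch in base R, assuming the project only needs a Rocker base image plus some CRAN packages. `make_dockerfile()` is a hypothetical function, not part of any existing package, and the package names are placeholders. It relies on the `install2.r` script from littler being available in the image, which I believe is the case for the Rocker images.

```r
# Hypothetical helper: return the lines of a simple Dockerfile for an R project.
# `base_image` and `packages` are chosen by the caller; nothing here is
# taken from an existing package.
make_dockerfile <- function(packages = character(),
                            base_image = "rocker/rstudio") {
  lines <- paste("FROM", base_image)
  if (length(packages) > 0) {
    # install2.r is the littler installer script; --error makes the
    # image build fail if a package cannot be installed.
    lines <- c(
      lines,
      paste("RUN install2.r --error", paste(packages, collapse = " "))
    )
  }
  lines
}

# Usage sketch: write the result to a temporary directory.
dockerfile <- make_dockerfile(c("knitr", "ggplot2"))
writeLines(dockerfile, file.path(tempdir(), "Dockerfile"))
```

Different disciplines could then share one function and vary only the package list, though system libraries and pinned package versions would need more than this.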

@benmarwick You mentioned the use of make. Do you have in mind using make to automate the data analysis pipeline, as in here? Do you already know of any (other) examples?

ghost avatar Apr 05 '16 06:04 ghost

Yes, there are some nice examples of make for scientific research workflows by Karl Broman and Carl Boettiger. So far I've not used make myself, preferring to use only knitr for this purpose. That said, I'm quite interested in remake and look forward to trying that.

The dockertest package contains functions for generating Dockerfiles from R packages and other R projects. But I've not had any success with it, and have been writing my dockerfiles by hand (mine are pretty simple).

benmarwick avatar Apr 05 '16 07:04 benmarwick

This project got 6 votes at the AuUnconf. People that were interested in continuing discussions around this issue after the Unconf were: Jessie (me), Adam, Peter B, Miles.

jesse-jesse avatar Apr 28 '16 04:04 jesse-jesse

I'd like to be in that loop, please.

Andrew R.

mensurationist avatar Apr 28 '16 04:04 mensurationist