Code under 'Examples' in the help files

Open wviechtb opened this issue 1 year ago • 3 comments

Hi all,

I would like to start an open/transparent discussion about the following issue:

At the end of the help files (in the Examples section), there is often quite extensive code illustrating some ways of analyzing the data or reproducing the results from the paper from which the data were taken. Essentially, this came about because I moved the datasets that were originally part of the metafor package to metadat and hence copied the help files from one package to the other. As a result, the majority of help files illustrate the analysis of the data using the metafor package (given that about 3/4 of the datasets currently in metadat were contributed by me). The question has come up whether it should be possible to contribute code that illustrates the analysis of datasets using other packages (of course, this would apply to any dataset, not just those contributed by me).

My approach so far has been this: The person who spends the time to extract, document, and contribute a dataset to the package (which can often take a considerable amount of time) is also the person who gets to write the Examples section (one can think of this as a reward). This is why I have never touched the Examples section of any dataset that was contributed to the package by other people. But given that many datasets were contributed by me, this might come across as me 'gatekeeping' which packages are used to illustrate the analyses. Hence, I would like to solicit feedback on this issue in an open/transparent manner.

A few additional thoughts:

  • Additional example code is certainly useful for those who would like to see different packages/approaches for analyzing the same data.
  • If additional code can be contributed, who gets to decide what code is actually added?
  • What if somebody just wants to add 'boilerplate' analysis code to every single dataset?
  • Also, how would the code section be structured? For example, whose code is shown first? What if there is disagreement as to the appropriate way(s) of analyzing the data?
  • Is there a limit to the extent of the code? Is there a danger that the Examples section might become confusing/messy if there is too much code/output there?
  • Analysis code is wrapped in \dontrun{} because the package would otherwise be rejected by CRAN due to excessively long run times on some of the examples. However, when creating the pkgdown docs (https://wviechtb.github.io/metadat/), I build them with pkgdown::build_site(run_dont_run = TRUE), which is why the actual results are shown there. What if code breaks? Who is then responsible for fixing things? As maintainer (and the person who currently builds these docs), it would at the very least be my responsibility to email people about their code.
  • Right now, it takes about 15-20 minutes to build these docs. Can this start to get out of hand if a lot more code is added? Should there be a limit on the run times?
  • If we decide to stick to the 'the contributor gets to write the Examples section' approach: I see lots of datasets in other packages that are not part of metadat. Why not move those datasets over, in which case one could balance out which packages are emphasized in the Examples section?
  • Another possibility: No code at all in the Examples section (but everybody is of course free to put code to illustrate the analysis of datasets on their own website or some other repository).
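
To make the \dontrun{} point above a bit more concrete, here is a minimal sketch of what such an Examples section and the corresponding docs build might look like. The choice of dat.bcg and the specific model are only illustrative assumptions and are not copied from any particular help file; wrapping the build in system.time() is also just my suggestion for keeping an eye on run times. The build call itself is the one mentioned above.

```r
# Sketch of an Examples section in an Rd help file (shown here as comments).
# The dataset and the analysis are illustrative assumptions.
#
# \examples{
# ### copy data into 'dat' and examine data (this part runs during R CMD check)
# dat <- dat.bcg
# head(dat)
#
# \dontrun{
# ### analysis code (not run by R CMD check, so it does not add to check time)
# library(metafor)
# dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat)
# res <- rma(yi, vi, data=dat)
# res
# }
# }

# Build the pkgdown site from the package root. With run_dont_run = TRUE,
# the code inside \dontrun{} is executed so that the results appear in the docs.
# system.time() is added only to report how long the whole build takes.
system.time(pkgdown::build_site(run_dont_run = TRUE))
```

If code contributed for other packages were added inside the same \dontrun{} block, it would also be executed by this build, so any breakage or increase in run time would show up at this step.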

Hope to hear other people's thoughts on this!

wviechtb, Apr 04 '23 08:04