
Add Rmarkdown module(s) for basic reproducible data analysis

Open lescai opened this issue 3 years ago • 2 comments

Following up on a thread on Slack in #modules, discussing with @drpatelh and @grst:

  • integration of Rmarkdown with Nextflow
  • adding Rmarkdown at the end of nf-core pipelines, to generate a minimal and general-use reproducible analysis report

RNAseq appears to be a workflow where basic analyses might be consolidated, which makes it a good starting point. This issue is meant to help brainstorm on the topic and identify:

  • benefits of adding this feature
  • which analyses might be considered of general utility and therefore worth adding into the pipeline

lescai avatar Oct 26 '20 15:10 lescai

Ideas of analysis steps that could benefit from Rmarkdown

  • Now that we have a sample-sheet, could we do an automated DE-analysis using e.g. edgeR or DESeq2?
  • This is not very commonly done, but for human and mouse samples I run BioQC to check for tissue contamination.

On the other hand, I feel these should rather become separate modules/processes and be integrated into the MultiQC report. Tbh, I'm not so convinced the RNA-seq pipeline is the right place for an Rmarkdown-based analysis, as it's mostly about preprocessing and not data analysis.

grst avatar Oct 27 '20 07:10 grst

I can see that a DESeq2 analysis has also been extensively discussed in #409. Maybe we could bring all these thoughts together? I do agree on the preprocessing point, but we will most likely face this issue in most pipelines, because it is very difficult to generalise an analysis template enough to capture all potential sample designs, specific questions, and elements to be explored. These are things most scientists would like to decide (a) during experimental design and (b) after a first round of data exploration (preprocessing?). In our case at NIBSC, however, we find that a relatively standard preprocessing step takes away a first, computationally intensive phase of data analysis and helps scientists focus additional analytical steps where they are needed.

lescai avatar Oct 27 '20 08:10 lescai

This is now covered by the differential abundance workflow (which may need renaming at some point, since it does exploratory things too). That workflow does indeed produce a markdown document as an output (as well as rendered HTML).

pinin4fjords avatar Nov 10 '23 09:11 pinin4fjords