rnaseq-pipeline
Add a task to generate a summary of the tools, software and references used
My first thought was to serialize the whole workflow with CWL, but that is definitely overkill and would be really challenging to achieve with our dynamic dependencies. I'm not against reusing CWL definitions where they are useful, though.
The summary should include:
- software versions (git commit for contrib/RSEM)
- reference metadata (define a new YAML format, with one file per reference at genomes/<reference_id>/metadata.yaml)
- pipeline version & Python dependencies
- full CLI arguments used at each step
- snapshot of the Conda environment
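As a rough starting point, the items above could be gathered with something like the sketch below. It is only an illustration: the function names, the dictionary keys and the `step_args` parameter are made up for this example and are not part of the pipeline, and it assumes `git` and `conda` are on the PATH (each entry falls back to `None` when the command is unavailable or fails).

```python
import json
import subprocess
import sys
from importlib import metadata


def run_or_none(cmd, cwd=None):
    """Return the stripped stdout of cmd, or None if it cannot be run or fails."""
    try:
        result = subprocess.run(
            cmd, cwd=cwd, capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # Missing executable, missing directory, or non-zero exit status.
        return None
    return result.stdout.strip()


def installed_python_packages():
    """Map each installed distribution name to its version."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken metadata
            packages[name] = dist.version
    return packages


def collect_summary(rsem_dir="contrib/RSEM", step_args=None):
    """Build a JSON-serializable summary (all key names are hypothetical)."""
    return {
        "python_version": sys.version.split()[0],
        "python_dependencies": installed_python_packages(),
        # Commit of the pipeline checkout in the current working directory.
        "pipeline_commit": run_or_none(["git", "rev-parse", "HEAD"]),
        # Commit of the RSEM checkout under contrib/RSEM.
        "rsem_commit": run_or_none(["git", "rev-parse", "HEAD"], cwd=rsem_dir),
        # Full text of the environment export, or None without conda.
        "conda_environment": run_or_none(["conda", "env", "export"]),
        # Assumed to be supplied by the caller: step name -> argument list.
        "step_arguments": step_args or {},
    }


if __name__ == "__main__":
    print(json.dumps(collect_summary(), indent=2))
```

How the per-step CLI arguments get recorded depends on how each task builds its command line, so here they are simply passed in by the caller.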
VERY IMPORTANT: version the metadata format itself, and make it backward-compatible!
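One possible way to get backward compatibility is to store an explicit version field in each metadata.yaml and upgrade older documents step by step when loading them. The sketch below works on an already-parsed mapping (for example the result of PyYAML's `yaml.safe_load`). The `format_version` field, the `genome` to `reference_id` rename and the function names are all invented for illustration; the real format is still to be defined.

```python
CURRENT_FORMAT_VERSION = 2


def _upgrade_v1_to_v2(doc):
    """Hypothetical migration: v2 renames 'genome' to 'reference_id'."""
    doc = dict(doc)
    doc["reference_id"] = doc.pop("genome", None)
    doc["format_version"] = 2
    return doc


# One upgrade function per old version; each moves a document one version up.
UPGRADES = {1: _upgrade_v1_to_v2}


def upgrade_metadata(doc):
    """Return a copy of doc upgraded to CURRENT_FORMAT_VERSION."""
    doc = dict(doc)
    # Assumption: files written before the field existed count as version 1.
    version = doc.get("format_version", 1)
    if not isinstance(version, int) or version < 1:
        raise ValueError(f"invalid metadata format version: {version!r}")
    if version > CURRENT_FORMAT_VERSION:
        raise ValueError(
            f"metadata format version {version} is newer than the supported "
            f"version {CURRENT_FORMAT_VERSION}"
        )
    while version < CURRENT_FORMAT_VERSION:
        doc = UPGRADES[version](doc)
        version = doc["format_version"]
    return doc
```

With this layout, old files keep loading as long as there is an upgrade function for every past version, and a file written by a newer pipeline fails loudly instead of being misread.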