snakemake-novice-bioinformatics

Several concerns on Ep 11 (new assembly workflow)

Open tbooth opened this issue 5 months ago • 3 comments

From review by @cmeesters:

Chapter 11 - Designing a new workflow

This chapter needs a major revision:

  • the assembly part comes out of the blue and is unrelated to everything before it. If you want to keep it, you need additional material describing the background, best put into a separate chapter (or several).
  • genome assembly is an intricate challenge. Recommending a relatively outdated tool like Velvet is dangerous, as there are numerous follow-up implementations tailored to various genome types.
  • the design phase is ok, but does not mention the template from the Snakemake workflow catalogue. Standardizing and contributing(!) a workflow has an enormous impact on the deployment and portability of workflows, and thereby on the whole Snakemake ecosystem. Not mentioning the catalogue and how to contribute to it is a major flaw.
  • for the whole community it would be better if people did not reinvent the wheel (e.g. write new workflows for existing solutions), but were able to contribute to existing workflows and fix issues. This, however, requires a bit more documentation in Snakemake. A basic intro to git (pull, fork, commit, create PRs) might be helpful, though it is beyond the scope of this intro. Yet perhaps a pointer to the catalogue and snakedeploy would be a good idea after all.
  • the separation of workflow and data is not taught (unless I overlooked it). Please introduce the --directory flag and the recommendation to separate workflow and data, which enables new users to apply the workflow to several different datasets.
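
To illustrate the last point, here is a minimal sketch of how one workflow could be applied to several datasets by keeping the Snakefile in one place and pointing `--directory` at each data folder. The paths `workflows/assembly/Snakefile`, `data/run_A` and `data/run_B` and the helper `snakemake_command` are hypothetical, made up for this example; only the `--snakefile`, `--directory` and `--cores` options are real Snakemake flags.

```python
# Sketch only: builds one snakemake command line per dataset directory.
# All paths below are hypothetical placeholders, not part of the lesson.

def snakemake_command(snakefile, data_dir, cores=1):
    """Return the argument list for running `snakefile` with `data_dir`
    as the working directory (via Snakemake's --directory option)."""
    return [
        "snakemake",
        "--snakefile", snakefile,   # where the workflow lives
        "--directory", data_dir,    # where inputs are read and outputs written
        "--cores", str(cores),
    ]

SNAKEFILE = "workflows/assembly/Snakefile"   # hypothetical workflow location
DATASETS = ["data/run_A", "data/run_B"]      # hypothetical dataset folders

commands = [snakemake_command(SNAKEFILE, d) for d in DATASETS]

if __name__ == "__main__":
    for cmd in commands:
        # Print the command instead of running it; to execute for real,
        # pass `cmd` to subprocess.run(cmd, check=True).
        print(" ".join(cmd))
```

Relative paths in the rules are then resolved inside each dataset folder, so the workflow code itself never needs to be copied next to the data.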

tbooth, Aug 29 '24 14:08