Basics of analysis preservation

Open • seneubert opened this issue 8 years ago • 3 comments

We would like to teach people, from the beginning, good practices that make an analysis reproducible. What is the minimum set of skills and tricks they would need to know? Here are some ideas:

  • basic Snakemake intro
  • usage of the containerization template https://gitlab.cern.ch/lhcb-analysis-preservation/containerization-cookie (this template will receive updates before the next Starterkit)
  • GitLab WG groups and EOS WG space, usage of XRootD for file access

seneubert, Dec 02 '17 10:12

I love the idea, although I would make it more about good practices and "common tools" (XRootD, EOS spaces, etc.) than about specific tools such as Snakemake. I would certainly list and discuss the specific tools, but I would not lean towards one or the other (unless the AP group has converged on a single tool, of course).

apuignav, Dec 02 '17 10:12

The point is to demonstrate automation and processing pipelines. Snakemake has gained the most traction in the collaboration so far. CERN also had a very successful meeting with the Common Workflow Language people. I agree that the tool is less important and that the pattern is what counts, but to make a hands-on lesson we need to choose an example. The AP roadmap document recommends Snakemake.

seneubert, Dec 02 '17 16:12

#86 adds a Snakemake lesson for this year's Impactkit. 🎉

chrisburr, May 07 '18 14:05