docker-introduction
More exercises
Met up with @aturner-epcc and @jcohen02 today and we discussed that the lesson needs more exercises. Adding this as a placeholder to look for places where we might add more exercises.
There are many exercises (target = biologists) in my Docker tutorial https://bcrf.biochem.wisc.edu/docker-beginner-for-biologists/
Part 1 - mostly similar to what is currently in the Carpentries lesson (hello-world etc.)
Part 2 - Sequence (pairwise) alignment (using EMBOSS) - note: requires TAG of the docker image
- understanding working within or outside of the container (blue or green colors for commands)
- introduction to docker share
- EMBOSS run from within the container (blue)
- EMBOSS run from outside the container (green)
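
As a rough illustration of the within/outside distinction in Part 2, the two ways of working correspond to two forms of `docker run`. This sketch is not taken from the tutorial. The image name `example/emboss:TAG`, the helper names, the file names, and the host path are all placeholders, and it assumes the image provides `bash` and has no entrypoint that intercepts the command. It only builds and prints the command lines, so it runs without Docker installed.

```python
import shlex

# Hypothetical image reference; the tutorial notes that an explicit tag is required.
IMAGE = "example/emboss:TAG"


def run_inside(image):
    """Build a command that opens an interactive shell inside the container ("blue")."""
    return ["docker", "run", "-it", "--rm", image, "bash"]


def run_outside(image, tool_args, host_dir, workdir="/data"):
    """Build a command that runs a single tool from the host shell ("green").

    host_dir is bind-mounted at workdir so that input and output files
    are visible on both sides.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{host_dir}:{workdir}",  # share a host directory with the container
        "-w", workdir,                  # start in the shared directory
        image, *tool_args,
    ]


# Pairwise alignment with EMBOSS needle; the FASTA file names are placeholders.
needle = [
    "needle",
    "-asequence", "a.fasta",
    "-bsequence", "b.fasta",
    "-gapopen", "10",
    "-gapextend", "0.5",
    "-outfile", "out.needle",
]

print(shlex.join(run_inside(IMAGE)))
print(shlex.join(run_outside(IMAGE, needle, "/home/user/project")))
```

In the first form the learner types `needle ...` at the container's prompt. In the second the whole command is issued from the host and the container exits once the tool finishes.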
Part 3:
- Sequence alignment: multiple alignment with ClustalOmega (`clustalo` - issue solved by Entrypoint bypass)
- `fastqc`: Quality control of Next Gen Sequencing data
- `sra toolkit`: software to download archived Next Gen data from the Sequence Read Archive database
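
The Part 3 summary does not say what the ClustalOmega issue was, only that it was solved by an Entrypoint bypass. One common way to bypass an image's entrypoint is the `--entrypoint` option of `docker run`, sketched below. The image name `example/clustalo:TAG` and the helper name are placeholders, and this may differ from what the tutorial actually does. The code only builds and prints the command.

```python
import shlex

# Hypothetical image reference, not the one used in the tutorial.
IMAGE = "example/clustalo:TAG"


def run_with_entrypoint(image, entrypoint, args):
    """Build a docker run command that replaces the image's ENTRYPOINT.

    --entrypoint must appear before the image name; everything after the
    image name is passed as arguments to the new entrypoint.
    """
    return ["docker", "run", "--rm", "--entrypoint", entrypoint, image, *args]


# Call clustalo directly, whatever entrypoint the image defines.
print(shlex.join(run_with_entrypoint(IMAGE, "clustalo", ["--help"])))

# Or bypass the entrypoint to get an interactive shell for debugging
# (-it is added here by hand; assumes the image contains bash).
shell_cmd = run_with_entrypoint(IMAGE, "bash", [])
shell_cmd.insert(2, "-it")
print(shlex.join(shell_cmd))
```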
Part 4: Graphical software - X11 and HTTP
- Web server `NGINX` - very simple static server and much less complex than Jekyll
- Web-based: python-based analysis of data (e.g. iris) served to web
- Web-based: RStudio Server
- X11 - IGV (Integrative Genomics Viewer) - Java-based software that displays to X11
- X11: the fun tools `xeyes`, `xcalc`, etc.
- X11: EMBOSS interactive (finding open reading frames (ORF) and protein secondary structure prediction)
@jsgro Thanks for sharing - will take a look at these
Please note that I wrote all of these exercises while at the same time trying to figure out how things work. It is very likely that they could be "streamlined" or clarified by a more experienced user.
Just from the summary, I wonder if these exercises would be good for a domain-specific "extra" or "add-on" episode.
Here is an exercise subject related to MS Excel spreadsheets that I am sure would be useful for a more "generic" type of exercise. "csvkit is a suite of command-line tools for converting to and working with CSV, the king of tabular file formats." https://csvkit.readthedocs.io/en/latest/ It can be used with (or within) a docker version, hence no need to install anything.
To remember what I did I also wrote a blog about this: https://bcrf.biochem.wisc.edu/2020/10/13/csvkit-command-line-spreadsheet-can-convert-and-compute-multiple-excel-files/
`csvkit` is based on Python 3.x, but the end user does not necessarily need to know that. The great thing here is that it is isolated in the container and therefore does not interfere with any local Python.
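
A minimal sketch of what such a generic exercise might look like, feeding a local CSV file to a containerised csvkit tool over standard input so that nothing is installed on the host. The image name `example/csvkit:TAG`, the helper name, and the file `data.csv` are placeholders, and the sketch assumes the image defines no entrypoint of its own. It only builds and prints the command.

```python
import shlex

# Hypothetical image reference; any image with csvkit installed would do.
IMAGE = "example/csvkit:TAG"


def csvkit_stdin(image, tool, *args):
    """Build a docker run command that pipes stdin into a csvkit tool.

    -i keeps standard input open so the host shell can redirect a file
    into the container without mounting a directory.
    """
    return ["docker", "run", "--rm", "-i", image, tool, *args]


# List the column names of a local file; csvcut reads stdin when no file is given.
list_columns = csvkit_stdin(IMAGE, "csvcut", "-n")
print(shlex.join(list_columns) + " < data.csv")

# Summary statistics for the same file.
summary = csvkit_stdin(IMAGE, "csvstat")
print(shlex.join(summary) + " < data.csv")
```

The redirection `< data.csv` is handled by the host shell, so the container never needs access to the host file system.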
Dealing with tabular data is rather universal. Just an idea. JYS