wrangling-genomics
Should we make the fastq download links an executable file?
Currently, episode 02 (Assessing Read Quality) has learners copy and paste the links into the terminal to download the fastq files that we work with in the rest of the episodes.
Instead, we could write an executable script that downloads these files. This would still have a copy-paste component, but would record the commands that they run to get the files.
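A minimal sketch of what such a script could look like, to make the proposal concrete. The URLs below are hypothetical placeholders rather than the lesson's real links, and the `download_fastq` function name and the `DRY_RUN` switch are my own assumptions, not anything that exists in the lesson today:

```bash
#!/usr/bin/env bash
# Sketch of a download script for the episode's fastq files.
# Stop on the first error or on use of an unset variable.
set -euo pipefail

# Hypothetical placeholder URLs; the lesson's real links would go here.
FASTQ_URLS=(
  "ftp://example.org/fastq/sample_1.fastq.gz"
  "ftp://example.org/fastq/sample_2.fastq.gz"
)

# Download every URL in FASTQ_URLS into the directory given as $1.
# With DRY_RUN=1 the curl commands are only printed, not executed.
download_fastq() {
  local dest_dir="$1"
  local url
  mkdir -p "$dest_dir"
  for url in "${FASTQ_URLS[@]}"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "curl -O $url"
    else
      # -O saves each file under its remote name.
      (cd "$dest_dir" && curl -O "$url")
    fi
  done
}

# Only download when a destination directory is passed on the command line.
if [ "$#" -ge 1 ]; then
  download_fastq "$1"
fi
```

Learners would save this as, say, `download_fastq.sh` (a made-up name) and run `bash download_fastq.sh some_directory`, which leaves the script itself as a record of how the files were obtained. Running it with `bash` avoids having to explain execute permissions at this point.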
@svigneau came up with this idea!
Thoughts?
I definitely see the point of having a script. However, I believe that having learners copy and paste each and every command would make the later point about the usefulness of automation and scripts much stronger.
Also, having a script means that we'll need to add a bit more of a cognitive load here, such as explaining permissions and/or showing how to list the contents of the script. It's not a huge one, for sure, but personally I'd keep it as is.
That being said, I won't argue against it if people are generally in favor. :)
@fpsom that is true! In the 2019-02-04-boston workshop, I don't think the cognitive load would be such a big problem, as this is covered pretty heavily in shell-genomics. However, doing c&p does motivate automation well later, and adding scripting would also add time to an already long lesson :)
AZ bbq: It is bad practice to download these with curl. If learners try this later with many files they will get blacklisted. We recommend they use the available files on the machines.
An alternative that shows people how to import files would be to show folks the Ensembl website and how to find an FTP link to the small E. coli reference genome we need later.
@sstevens2 I disagree... downloading the files this way makes the workflow fully repeatable outside of the curated instance, and demonstrates how learners would download information from the internet. However, I also understand that this is not a scalable approach for more than ~10 files.
@JasonJWilliamsNY I like that approach. This way the learners still see how to find publicly available data and put it on to a computer.