Draft PR training
Hi @lescai
Overall it looks good.
I took your PR and made a new branch because I was doing some heavy repo modifications and I wasn't sure how it would work out.
I've been through the content and it looks good. You will notice some significant movement in the repo structure to accommodate your work. I added some stylistic changes (changes happening to all training on the site, e.g., explicit exercise blocks when we want the training participant to do something, block headings, and line numbers). I also added more information about how to open GitPod and how to get to the directory that we want them to be in as we can't assume a participant will know where to look for your material.
The only issue that needs to be resolved is the size of the fq and reference files. I kept them out of the repo for now because they more than double the repo size. Is there any way we can reduce these down while maintaining the real-life applicability? E.g., removing reads and/or fudging some of the reference files? Having these in the repo will load them by default when GitPod starts and I would like to keep it as light as possible so we don't have long wait times when building and environment.
Cc @mribeirodantas as I moved and updated the RNA-seq material you wrote as well
Deploy Preview for nextflow-training ready!
| Name | Link |
|---|---|
| Latest commit | 95d65fc1242a39bd0df959ec7f2ca83104fc9386 |
| Latest deploy log | https://app.netlify.com/sites/nextflow-training/deploys/65a8fa689d13e5000877d86e |
| Deploy Preview | https://deploy-preview-300--nextflow-training.netlify.app |
| Preview on mobile | Toggle QR Code...Use your smartphone camera to open QR code link. |
To edit notification comments on pull requests, go to your Netlify site configuration.
Hi @christopher-hakkaart that's wonderful thanks for all the hard work you've done to accommodate the material within the new repo structure. About the inputs and reference files: these are also available through my teaching repos, and initially I did add a git clone to the gitpod startup script. One should consider how long it would take for the user to do that. The reference is located here https://github.com/lescai-teaching/datasets_reference_only (sequence and gatk bundle), and I believe it is also in the test data for Sarek (so might be available in the new test data location for nf-core as well). The reads have been simulated also for my teaching activities and are located here: https://github.com/lescai-teaching/dataset_exercise_resequencing Let me know how you would prefer to organise those, and I'd be happy to move/transfer as needed.
I'll get a second opinion about how to manage this. I'd like to have good practices now so if/when we have other trainings added in the future we have a procedure in place for how to deal with larger data files.
I think it's best to keep the gitpod as light as possible as most read and reference files won't be used by most users, so there isn't as much need for most users to load them in each environemnt.
If we can't find a solution quickly we might just point at your teaching resources as a quick fix, but commit to finding a long-term solution.