docker-builds
updating flye to 2.9.1
There's a recent update to flye. According to the notes at https://github.com/fenderglass/Flye/releases/tag/2.9.1, this is a minor release that focuses on fixing bugs.
I took the old dockerfile and changed the following:
- added `--no-install-recommends` to `apt-get install`
- moved flye version to a variable
- added a test
- changed the base to ubuntu:bionic
- added readme
Pull Request (PR) checklist:
- [X] Include a description of what is in this pull request in this message.
- [X] The dockerfile successfully builds to a test target for the user creating the PR. (i.e. `docker build --tag samtools:1.15test --target test docker-builds/samtools/1.15`)
- [X] Directory structure as name of the tool in lower case with special characters removed with a subdirectory of the version number (i.e. `spades/3.12.0/Dockerfile`)
- [X] Create a simple container-specific README.md in the same directory as the Dockerfile (i.e. `spades/3.12.0/README.md`)
- [X] Dockerfile includes the recommended LABELS
- [X] Main README.md has been updated to include the tool and/or version of the dockerfile(s) in this PR
- [X] Program_Licenses.md contains the tool(s) used in this PR and has been updated for any missing
Also, the first iteration of the GHActions workflow failed, due to running longer than the limit of 360 minutes.
I've retriggered the workflow to run (maybe the test data will download faster this time?), but we might consider removing one of the test commands (probably just remove whichever dataset is larger & takes longer to download and run through flye)
EDIT: we could add `time` in front of each of the wget & flye test commands to see which one takes longer.
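One way to act on the EDIT above is to wrap each test command in a small timing helper. This is only a sketch: the `timed` function name is hypothetical, and `sleep 1` and `true` are placeholders for the real wget and flye commands in the Dockerfile.

```shell
# Hypothetical helper: run a command, report wall-clock seconds on stderr,
# and pass the command's exit status through unchanged.
timed() {
  local start end status
  start=$(date +%s)
  status=0
  "$@" || status=$?
  end=$(date +%s)
  echo "elapsed: $((end - start))s for: $*" >&2
  return "$status"
}

# Placeholders standing in for the real download and assembly steps.
timed sleep 1
timed true
```

A helper based on `date` avoids depending on `time` being available in the shell that runs the Dockerfile `RUN` steps, which may not be the case in a minimal image.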
I built this locally and it took 30 minutes to build through the test layer:
[redacted]
[2022-09-09 20:01:42] INFO: Final assembly: /data/out_nano/assembly.fasta
Removing intermediate container 6a6bb8ca775b
---> f3655f255ce6
Successfully built f3655f255ce6
Successfully tagged erin/flye:2.9.1
real 30m26.732s
user 0m1.566s
sys 0m2.145s
I'm unsure whether docker build uses all available CPUs, but for reference my machine has 8 CPUs and 32 GB RAM.
I have a feeling that the GH Actions workflow may fail again because the runner only has 2 CPUs.
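To see how many CPUs a build step can actually use (8 locally versus the 2 expected on the GH Actions runner), a check like the one below could be added to the test commands. It is a sketch under assumptions: the `THREADS` variable name is hypothetical, the read file name in the commented-out line is a placeholder, and passing the value to flye via `--threads` assumes the test commands do not already set a thread count.

```shell
# Count the CPUs visible to this shell; fall back to 1 if neither tool exists.
THREADS=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "CPUs visible: ${THREADS}"

# Where the value could go (commented out; the input file name is a placeholder):
# flye --nano-raw reads.fastq.gz --out-dir out_nano --threads "${THREADS}"
```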
I have an additional recommendation for the test layer. I believe flye comes with a toy dataset and you can invoke the tests with this command: `python -c "import flye.tests.test_toy as t; t.main()"`
Pulled it from here: https://github.com/bioconda/bioconda-recipes/blob/61f8d9e07af091415b30357eef9bcccf5351cd01/recipes/flye/meta.yaml#L34
It should run in about a minute or less and may be more appropriate than assembling a whole E. coli genome.
Alternatively, this command probably does the same thing:
python flye/tests/test_toy.py
https://github.com/fenderglass/Flye/blob/flye/docs/INSTALL.md#installing-from-source
Love it when developers provide test data and scripts ❤️
I'm converting this to draft until I get the test thing figured out.
I tried a few smaller genomes, but each option seemed like it would take too long. I've settled on using test_toy.py and I'm moving on with my life.
This is now ready for review