docker-builds

updating flye to 2.9.1

Open erinyoung opened this issue 3 years ago • 4 comments

There's a recent update to flye. According to the notes at https://github.com/fenderglass/Flye/releases/tag/2.9.1, this is a minor release that focuses on fixing bugs.

I took the old dockerfile and changed the following:

  • added --no-install-recommends to apt-get install
  • moved flye version to a variable
  • added a test
  • changed the base to ubuntu:bionic
  • added readme

Pull Request (PR) checklist:

  • [X] Include a description of what is in this pull request in this message.
  • [X] The dockerfile successfully builds to a test target for the user creating the PR. (e.g. docker build --tag samtools:1.15test --target test docker-builds/samtools/1.15 )
  • [X] Directory structure uses the name of the tool in lower case with special characters removed, with a subdirectory named for the version number (e.g. spades/3.12.0/Dockerfile)
  • [X] Create a simple container-specific README.md in the same directory as the Dockerfile (e.g. spades/3.12.0/README.md)
  • [X] Dockerfile includes the recommended LABELS
  • [X] Main README.md has been updated to include the tool and/or version of the dockerfile(s) in this PR
  • [X] Program_Licenses.md contains the tool(s) used in this PR and has been updated for any missing

erinyoung, Sep 08 '22 19:09

Also, the first iteration of the GHActions workflow failed because it ran longer than the 360-minute limit.

I've retriggered the workflow (maybe the test data will download faster this time?), but we might consider removing one of the test commands, probably whichever dataset is larger and takes longer to download and run through flye.

EDIT: we could put time in front of each of the wget and flye test commands to see which one takes longer.
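A minimal sketch of that timing idea, in case it helps. The timed helper and its labels are hypothetical (not from the Dockerfile), and the actual wget and flye test commands are not shown in this thread, so they appear only as commented placeholders. It assumes a date that supports +%s, which GNU and BSD date both do.

```shell
# Hypothetical helper: run a command, then report its wall-clock time on stderr.
# Note: the variables are not local, so they are visible to the calling shell.
timed() {
  label=$1
  shift
  start=$(date +%s)
  status=0
  "$@" || status=$?   # keep the command's exit code without tripping `set -e`
  end=$(date +%s)
  echo "${label}: $((end - start))s (exit ${status})" >&2
  return "$status"
}

# Placeholder usage; substitute the real test commands from the Dockerfile:
# timed download wget "$READS_URL"
# timed assemble flye <flye arguments from the test layer>
```

Whole-second resolution is coarse, but it should be enough to tell a download that takes minutes from an assembly that takes hours.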

kapsakcj, Sep 09 '22 19:09

I built this locally and it took 30 minutes to build through the test layer:

[redacted]
[2022-09-09 20:01:42] INFO: Final assembly: /data/out_nano/assembly.fasta
Removing intermediate container 6a6bb8ca775b
 ---> f3655f255ce6
Successfully built f3655f255ce6
Successfully tagged erin/flye:2.9.1

real    30m26.732s
user    0m1.566s
sys     0m2.145s

I'm unsure whether docker build uses all available CPUs, but my machine has 8 CPUs and 32 GB of RAM.

I have a feeling that the GH Actions workflow may fail again because its runner only has 2 CPUs.

kapsakcj, Sep 09 '22 20:09

I have an additional recommendation for the test layer. I believe flye comes with a toy dataset and you can invoke the tests with this command: python -c "import flye.tests.test_toy as t; t.main()"

Pulled it from here: https://github.com/bioconda/bioconda-recipes/blob/61f8d9e07af091415b30357eef9bcccf5351cd01/recipes/flye/meta.yaml#L34

It should run in about a minute or less and may be more appropriate than assembling a whole E. coli genome.
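As a rough sketch, the same toy test could also be tried against an already-built image before wiring it into the Dockerfile. The toy_test helper is hypothetical. It assumes that python resolves inside the image and that the flye.tests package is installed there, neither of which is confirmed in this thread.

```shell
# Hypothetical helper: run Flye's toy test inside a given image.
# Assumes `python` and the flye.tests package exist in the image.
toy_test() {
  image=$1
  docker run --rm "$image" python -c "import flye.tests.test_toy as t; t.main()"
}
```

For example, toy_test erin/flye:2.9.1 would use the tag from the local build log above; the helper returns whatever exit code docker run reports.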

kapsakcj, Sep 09 '22 20:09

Alternatively, this command probably does the same thing as the command above:

python flye/tests/test_toy.py

https://github.com/fenderglass/Flye/blob/flye/docs/INSTALL.md#installing-from-source

Love it when developers provide test data and scripts ❤️

kapsakcj, Sep 09 '22 20:09

I'm converting this to draft until I get the test thing figured out.

erinyoung, Sep 26 '22 17:09

I tried a few smaller genomes, but each option seemed like it would take too long. I've settled on using test_toy.py and I'm moving on with my life.

erinyoung, Oct 07 '22 18:10

This is now ready for review.

erinyoung, Oct 07 '22 18:10