modules icon indicating copy to clipboard operation
modules copied to clipboard

Add CI to build mulled containers with Wave

Open edmundmiller opened this issue 2 years ago • 30 comments

Closes #4077

edmundmiller avatar Oct 17 '23 12:10 edmundmiller

Why don't we do that already?

maxulysse avatar Oct 18 '23 07:10 maxulysse

Why don't we do that already?

Just hadn't gotten around to it 😆

I'm also having trouble authenticating with the registry. I didn't think it would have to go through tower if it was public. @pditommaso could you confirm?

I think if not we can use Tower to build the images in CI using nextflow inspect if the Dockerfile or environment.yml changes and push the image for people.

edmundmiller avatar Oct 18 '23 08:10 edmundmiller

How does that impact offline use of pipelines? A mulled container can easily be downloaded with nf-core download. How about those built with Wave? Will this still work?

MatthiasZepper avatar Oct 18 '23 09:10 MatthiasZepper

How does that impact offline use of pipelines? A mulled container can easily be downloaded with nf-core download. How about those built with Wave? Will this still work?

That was @ewels concern as well! I think nextflow inspect might be able to handle this in theory.

But I think I misunderstood that the wave builds are happening locally, but I believe they're happening on a web server.

Ideally this would just be for building the containers, and then end users will be able to pull the container normally, unless they modify the environment.

edmundmiller avatar Oct 18 '23 10:10 edmundmiller

I believe this is a supported feature of Wave Freeze and nextflow inspect -concretize, however I think if it can't run offline this feature is a no-go, so we should triple check.

Build time is happening at the Wave server, but this should be functionally equivalent to running nf-core download. @MatthiasZepper, would you agree?

adamrtalbot avatar Oct 18 '23 10:10 adamrtalbot

I am not in Barcelona and probably somewhat out of the loop. Additionally, I have no experience using Wave, @mribeirodantas Bytesize talk is essentially all I know about it.

In general, I have no objections against abandoning Bioconda's CI in favour of Wave. Coincidentally, I learned yesterday why Oxford Nanopore has little confidence in Bioconda and that they are strongly favouring own distribution channels. Moving away from Bioconda allows ONT to support AArch64 natively and for a better provenance in dependencies. Taking into account the importance of ONT software in nf-core pipelines, I think that alone justifies allowing for other software sources and container registries in modules, including building them with Wave.

I believe this is a supported feature of Wave Freeze and nextflow inspect -concretize, however I think if it can't run offline this feature is a no-go, so we should triple check.

The ability to run offline is indeed paramount for us at NGI. We would need to maintain our own forks of the pipelines with patched modules, if that option was dropped from the nf-core pipelines/modules. From which registry the container is downloaded/pulled is pretty much irrelevant for us (a non-free/non-sponsored service might be a slight downer, though), but might matter for scientists in other countries that are under international sanctions (e.g. Iran, Russia etc.?)

I believe this is a supported feature of Wave Freeze and nextflow inspect -concretize. Build time is happening at the Wave server, but this should be functionally equivalent to running nf-core download.

Tools 2.10 is simply extracting the container definitions from the modules & configs with Regexes. Subsequently, it is downloading all that start with http:// paths and pulling the remainder. I am not aware of any code that would support Wave builds as of now.

But the decision of abandoning the regexes in favour of nextflow inspect for future releases is set in stone and I presume you are actually implementing that already @adamrtalbot & Marcel? In that case, nf-core download could probably also be enabled to resolve any Wave instructions and also kick off builds and obtain images from a dynamically created container URI if needed?

My only concern is that the pipeline, when executed, will not find the appropriate container without contacting the Wave server asking for its hash respectively the required layers. But I believe this concern is actionable by the Nextflow/Wave plugin developers?

MatthiasZepper avatar Oct 18 '23 12:10 MatthiasZepper

I'm also having trouble authenticating with the registry. I didn't think it would have to go through tower if it was public. @pditommaso could you confirm?

Currently, without credentials you will be limited to 25 builds per day. We are discussing how to allow a community registry for nf-core pipelines.

pditommaso avatar Oct 18 '23 16:10 pditommaso

Currently, without credentials you will be limited to 25 builds per day. We are discussing how to allow a community registry for nf-core pipelines.

Would a combination of Wave with Kraken be an option? There are enough academic institutions represented in the nf-core community that could sponsor a server/mirror or two, so we could host the images in a peer-to-peer network after they were build with Wave? Unfortunately, this does seemingly not support Singularity images...

MatthiasZepper avatar Oct 19 '23 09:10 MatthiasZepper

Okay I think this works pretty well. We can also swap to just plain docker builds if Wave is too much headache.

Only problem is I can't tell why wave isn't working. I'm guessing the tower instance doesn't have access to the GH container registry, but I'd hope it'd fail gracefully.

edmundmiller avatar Nov 02 '23 23:11 edmundmiller

Yes, it looks like the creds provided do not have enough permissions

error checking push permissions -- make sure you entered the correct tag name, and that you are authenticated correctly, and try again: checking push permission for "[docker.io/emiller88/modules:bowtie-1.3.0--6627003a47795810](http://docker.io/emiller88/modules:bowtie-1.3.0--6627003a47795810)": creating push check transport for [index.docker.io](http://index.docker.io/) failed: GET https://auth.docker.io/token?scope=repository%3Aemiller88%2Fmodules%3Apush%2Cpull&service=registry.docker.io: unexpected status code 401 Unauthorized: {"details":"access token has insufficient scopes"}

pditommaso avatar Nov 03 '23 05:11 pditommaso

Would a combination of Wave with Kraken be an option? There are enough academic institutions represented in the nf-core community that could sponsor a server/mirror or two, so we could host the images in a peer-to-peer network after they were build with Wave? Unfortunately, this does seemingly not support Singularity images...

@MatthiasZepper I had missed this. Kraken looks indeed an interesting option. In relation to Singularity as long as it's Kraken is OCI compliant it should work

pditommaso avatar Nov 03 '23 05:11 pditommaso

https://github.com/nextflow-io/nextflow/commit/f5362a7b067173a29d684663df22bb48fbbf5659

I think this might fix my singularity or docker hash issue? I thought it was already a thing for some reason.

edmundmiller avatar Nov 07 '23 23:11 edmundmiller

Ah we're waiting on 23.11.0-edge now, I thought it might have been released.

edmundmiller avatar Nov 22 '23 18:11 edmundmiller

Hello! We're in the process of migrating to nf-test!🚀

We would appreciate this test being rewritten in nf-test. We have documented a step-by-step process to do so here. If you aren't able to do that right now raise an issue.

github-actions[bot] avatar Nov 26 '23 14:11 github-actions[bot]

Ah, this might be dead in the water unless we spin up a singularity registry.

But it also doesn't look like wave supports any singularity registry

Could someone educate me on what was the issue with letting it pull the docker images? Space?

edmundmiller avatar Nov 30 '23 17:11 edmundmiller

@marcodelapierre any thoughts here? The singularity images are still failing because they can't find bowtie_build

edmundmiller avatar Feb 16 '24 15:02 edmundmiller

Nevermind after updating the wave version and Nextflow version everything worked!

edmundmiller avatar Feb 16 '24 17:02 edmundmiller

Things we'd need to do to use these in pipelines:

  1. Update nf-core tools linting
  2. Add singularity.ociAutoPull and apptainer.ociAutoPull
  3. Bump Nextflow versions to 23.12.0-edge

Bonus points:

  1. Use the biocontainers script to make containers public automatically
  2. Maybe consolidate images to be one environment per tool (ie we don't really need 2 different bowtie2 environments, it's just one extra dependancy)
  3. Point at a different container backend that has a better uptime.
  4. Clean up #4306 and make version bumping happen automatically.

edmundmiller avatar Feb 16 '24 17:02 edmundmiller

3. Bump Nextflow versions to 23.12.0-edge

I feel a bit nervous about this for production pipelines. Can do the prep work but might be nice to wait for the next stable release at least.

ewels avatar Feb 16 '24 19:02 ewels

Awesome work 🚀

mahesh-panchal avatar Feb 16 '24 19:02 mahesh-panchal

I feel a bit nervous about this for production pipelines. Can do the prep work but might be nice to wait for the next stable release at least.

Agreed, hopefully 24.04.0 will be out at the beginning of April rather than the end of April 😉

edmundmiller avatar Feb 19 '24 16:02 edmundmiller

I think we can resume this effort as of nextflow 24.03.0-edge nextflow support wave community registry which allows you to build images without providing a custom (password-protected) container registry. Just remove wave.build.repository in your setting

pditommaso avatar Apr 24 '24 17:04 pditommaso

https://github.com/seqeralabs/wave-cli/commit/c798f283f5d5c0b94de3f2a44455f6c68e4d7462 Might be useful.

Currently swapped the environment names:

+ name: htslib_samtools
- name: samtools_view

Personally think this is a better naming structure so we don't have an environment that could be the same. samtools_view samtools_sort, etc.

edmundmiller avatar Jun 11 '24 21:06 edmundmiller

setting Image
imageSuffix community.wave.seqera.io/library:htslib_samtools--9bc8b2f12954060b
tagPrefix community.wave.seqera.io/library/build:htslib_samtools--a81ec36ed9046273
none community.wave.seqera.io/library/build:htslib_samtools--a81ec36ed9046273
-- --
No env name community.wave.seqera.io/library/htslib_samtools:1.20--11a4e6daa46930ec
env name* community.wave.seqera.io/library/htslib_samtools:9bc8b2f12954060b
Seqera containers community.wave.seqera.io/library/htslib_samtools:1.20--11a4e6daa46930ec

So there are two options here.

  1. Change the Seqera containers default to remove the 1.20.
  2. Add the option to wave-cli to generate it with a version in the tag even with env_name.
  3. Get rid of the env names.

I think 2 is best option to avoid work/conda/environment-764620c334b2a2c557b6a7deec235610 which can make them difficult to find.

But I also see the arguement to drop the env_names, let Nextflow figure out if the environments are the same, and then we're future proofed against and Seqera containers default changes.

[!NOTE] Main reason for wanting to stay in line with the website is to avoid having duplicate containers for the same thing (ie htslib_samtools:1.20--11a4e6daa46930ec & htslib_samtools:9bc8b2f12954060b)

edmundmiller avatar Jun 12 '24 01:06 edmundmiller

🙃

https://github.com/nf-core/modules/issues/4234

That cuts out option 3.

I think this sounds like a Nextflow bug because the two modules should be able to use the same environment.

The plot thickens:

env name community.wave.seqera.io/library/bowtie_samtools:76ee1ea1b2f95f97
Seqera containers community.wave.seqera.io/library/bowtie_samtools:772b3bee982574e4

With bowtie_samtools, there's no version tag(1.3.1--hash) with Seqera containers(I suspect because htslib is baked into samtools in the samtools_view env?) but the hashes don't match?

edmundmiller avatar Jun 12 '24 03:06 edmundmiller

Okay, discussed with @pditommaso in Slack.

The issue with the inconsistent hash is the name field.

Determined the best way forward:

  1. Remove all the env names to match Seqera containers
  2. If the errors come back up, swap tests to use mamba instead of conda and hope that fixes the conda bug

2 did happen, we tested it in https://github.com/nf-core/sarek/pull/1564/ and https://github.com/nf-core/modules/pull/5802

Figured the easiest path is actually to just stop testing conda and pytest-workflow together. I think it's a worthy sacrifice to move this forward, since we're phasing pytest-workflow out soon(™️).

Next steps:

  • [x] Update tools linting to get rid of looking for a name field(In #5802)
  • [ ] Merge https://github.com/nf-core/modules/pull/5802
  • [ ] ~~Make a PR removing all the env names~~ Not necessary, we can remove them all after we're more confident.
  • [ ] Merge #4080 and see what starts breaking
  • [ ] Merge #4940

edmundmiller avatar Jun 12 '24 22:06 edmundmiller

Copying over dump from Slack:

Main thing is for Singularity: I would like to keep https:// downloads for Singularity .sif files if possible, so that we don't break nf-core download That's currently in-development for Wave / Wave-CLI and should hopefully be ready any day Having native Singularity images is also better because it's quicker / less compute resources for the end user rather than recompiling at run time

If we're doing this then we also need to figure out the best nextflow syntax for container in the processes, as we're back to having two separate strings

And finally, I would also like to be building for ARM as well where possible

ewels avatar Jun 13 '24 10:06 ewels

Yeah I was thinking about it during office-hours.

I think for now we keep the container directive magic the way it is. We're trying to do multiple steps at once, which wasn't the original goal of this PR.

  1. Swap biocontainers for Seqera containers
  2. Clean up the container directive and see if we can make it more minimal.
  3. Build for multi-arch and handle that in nf-core.

There's no way we can do all of those at once without something breaking.

I've added back in the old singularity containers now. There's nothing blocking this PR, unless anyone thinks Seqera containers is less reliable or is missing something compared to (mulled)biocontainers.

That's a separate discussion(and a good one to have), but not the point of this PR.

edmundmiller avatar Jun 18 '24 19:06 edmundmiller

It'd be nice to also write the container URIs to the meta.yml file. No problem for it to be duplicated for now, means we can just remove from the main.nf file later. See also https://github.com/nf-core/modules/pull/6659#issuecomment-2357587262

Note that we haven't decided on a spec for how this should look in the meta YAML file yet, so need to do that. Should be easy to agree though, I have a rough idea in my head already.

ewels avatar Sep 18 '24 14:09 ewels

We should add in ARM builds as well (may need to tolerate + log failures).

Sounds good! We can log it in the CI definitely, or maybe we output it to files like bioconda does.

Also, is it possible to parallelise the different requests in separate jobs? Would go 4x faster then.

Already is! It just skips the step depending on the matrix.

edmundmiller avatar Sep 20 '24 23:09 edmundmiller