Container cache download
Fixes #3019 and #3162.
First #3162. The code has to deal with an out_path (always defined) and sometimes a cache_path too. The implementation was slightly convoluted as it was doing fresh downloads into the cache_path or out_path (first decision point) and then doing an extra copy if needed (second decision point). Because of that confusion, symlinks across container registries were not all created across both locations.
I propose to reverse the logic to make it more straightforward:
- Always download into
out_pathand create its symlinks. - Then, optionally copy to
cache_path(and create symlinks there).
Then, #3019: I propose to handle the singularity "library" directory this way:
- Just like in Nextflow itself, it's considered a read-only location. It means that containers can only be copied from it, not to it, and that we shouldn't be even creating symlinks there.
- There is no point in having a
--container-library-utilisationparameter for the library because i)remotewould be redundant with the--container-cache-utilisation'sremotemode, ii)amendis not possible as per the read-only rule, so iii)copyis the only possible mode. - Therefore, the most natural place to use the library is as a source of containers, in parallel of https downloads and
singularity pull. WhenNXF_SINGULARITY_LIBRARYDIRis set and the container exists in the library, it is copied to the target directories (out_pathand possiblycache_pathtoo)
PR checklist
- [x] This comment contains a description of changes (with reason)
- [ ]
CHANGELOG.mdis updated - [ ] Unit-tests
- [ ] Documentation in
README.md
Codecov Report
Attention: Patch coverage is 0% with 16 lines in your changes missing coverage. Please review.
Project coverage is 75.52%. Comparing base (
8e47a33) to head (78650af).
| Files with missing lines | Patch % | Lines |
|---|---|---|
| nf_core/pipelines/download.py | 0.00% | 16 Missing :warning: |
Additional details and impacted files
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
The code is there, but I still need to add unit-tests and documentation
@muffato - do you think you'll have time to take a look at this PR in the near(ish) future? It'd be great to get it moving and merged if possible..
Hi @ewels . Unfortunately there's too much work left to do on this PR. I can't see myself getting to the bottom of it any time soon.
No problem - hopefully @mirpedrol can take it over then when she has time, if that's ok 🙏🏻
No problem - hopefully @mirpedrol can take it over then when she has time, if that's ok 🙏🏻
Frankly, I consider this feature expendable. It is nice to save some download bandwidth and storage space, but the poor Seqera Containers support is haunting me much more (and also wasting significant storage).
Considering that Júlia has the most extensive knowledge of the nf-core tools codebase, I would rather suggest / entreat / beg her for working on tooling around the container.yml creation, linting and release logic. As soon as each new pipeline release ships with it, we can adopt it on the download side as well.