Cache invalidated when $launchDir is used (no resume possible)
Bug report
I am not sure whether this is a bug or a feature, but it is definitely worth mentioning, at least on https://www.nextflow.io/blog/2019/troubleshooting-nextflow-resume.html, because I find this behaviour unintuitive and it took me quite a while to debug (I read through a lot of hash-trace diffs 🙈 )
When using the $launchDir meta variable (or $workflow.launchDir), the cache is affected and tasks are not restored when running with -resume. I don't understand how this is achieved, but apparently nextflow knows that it's a path and treats it as an input. If the modification time of the folder has changed (which happens, for example, when a new file is created inside the folder), the cache is invalidated. Since nextflow writes its log files to $launchDir/.nextflow.log* by default, no process that uses the $launchDir variable in any way can be resumed.
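The filesystem behaviour described above can be checked independently of Nextflow. The following is a minimal shell sketch, assuming bash and a POSIX-like filesystem; the temporary directory only stands in for the launch directory, and the names dir, marker and result are made up for illustration. It shows only that creating a file updates the modification time of the parent directory, not how Nextflow computes its task hashes.

```shell
#!/usr/bin/env bash
set -euo pipefail

dir=$(mktemp -d)             # hypothetical stand-in for the launch directory
marker=$(mktemp)             # reference timestamp taken "before the run"
sleep 1                      # make sure timestamps differ at 1-second resolution
touch "$dir/.nextflow.log"   # a new file, like the log written on each run

# -nt is true when the left-hand path has a newer modification time
if [ "$dir" -nt "$marker" ]; then
  result="directory mtime changed"
else
  result="directory mtime unchanged"
fi
echo "$result"

rm -rf "$dir" "$marker"      # clean up the temporary paths
```

If the directory is treated as a file-like input whose timestamp takes part in the hash, this change on every launch would be enough to explain the cache misses.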
Expected behavior and actual behavior
Actual: running the workflow multiple times with -resume does not use the cache when $launchDir appears in the script section.
Expected: $launchDir should not have any impact on the cache ingredients (hashes and timestamps of inputs/outputs).
Steps to reproduce the problem
Run the workflow below with nextflow run workflow.nf -resume multiple times to see that the cache is not working.
#!/usr/bin/env nextflow
nextflow.enable.dsl = 2

process Process {
    input:
    val(input)

    output:
    file('*.txt')

    script:
    """
    echo $workflow.launchDir
    touch "a.txt"
    """
}

workflow {
    foo = Channel.from(1,2,3)
    Process(foo)
}
Program output
░ tgal@cca008:/sps/km3net/users/tgal/tmp/nextflow-cache
░ 18:38:25 > nextflow run workflow.nf -resume
N E X T F L O W ~ version 21.10.3
Launching `workflow2.nf` [grave_waddington] - revision: 3ee6c5a013
executor > local (3)
[49/67ffb1] process > Process (1) [100%] 3 of 3 ✔
░ tgal@cca008:/sps/km3net/users/tgal/tmp/nextflow-cache
░ 18:38:54 > nextflow run workflow.nf -resume
N E X T F L O W ~ version 21.10.3
Launching `workflow2.nf` [kickass_volhard] - revision: 3ee6c5a013
executor > local (3)
[f6/5b035c] process > Process (3) [100%] 3 of 3 ✔
Environment
- Nextflow version: 21.10.3.5655
- Java version: openjdk version "1.8.0_312", openjdk version "11.0.12" 2021-07-20 LTS
- Operating system: macOS Big Sur, CentOS 7, ArchLinux (2021.09)
- Bash version: zsh 5.0.2 (x86_64-redhat-linux-gnu), zsh 5.8 (x86_64-apple-darwin20.0)
I have this same issue.
This happens because launchDir is a directory Path. Every time the execution is launched, the directory has different content and therefore produces a different hash, which invalidates the cache.
If you want to prevent that, use workflow.launchDir.toString()