nextflow
Joining channels using maps as the key to join by can fail on resume
Bug report
Expected behavior and actual behavior
When joining two channels, for example, on the first element which is often a map with sample meta information, I would expect to be able to resume my pipeline and all samples to be processed. Instead, it can happen that a large part of my samples are dropped in the join (presumably the elements suddenly mismatch).
Steps to reproduce the problem
This is a bit tricky to reproduce since it may require large input. I've written about it in more detail but the examples there do not reproduce the problem. However, several nf-core pipelines do seem to be affected.
I have a pipeline where this consistently happens on resume but I cannot share the data (FASTQ pairs) with you to reproduce this.
Program output
The output will look something like the following. Be aware that the outputs of MINIO and FASTQ_READCOUNT are joined and that, before being interrupted, the pipeline had already processed more than 20 samples.
[18/09acfc] process > MINIO (6061) [100%] 78 of 78, cached: 78
[91/22cd44] process > FASTQ_READCOUNT (6061) [100%] 78 of 78, cached: 78
[e4/d5f672] process > QUALITY_CONTROL:FASTP (6061) [100%] 20 of 20, cached: 20
[97/92efc0] process > QUALITY_CONTROL:FASTQC (6061) [100%] 20 of 20, cached: 20
Environment
- Nextflow version: 22.04.5 build 5708
- Java version: openjdk version "11.0.16.1" 2022-08-12
- Operating system: Linux 5.19.5-arch1-1
- Bash version: 5.1.16
Without further evidence, I'm inclined to think that this is caused by a wrong pattern used by the pipeline. Is the map modified when passed around different processes?
I will try to come up with a minimal example but what do you mean by a wrong pattern? On a single, complete run, the pipeline finishes as expected.
The problem can arise when the map content is modified across different processes.
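As a rough illustration of what "modifying the map" could look like, here is a minimal sketch. It is not taken from the affected pipeline: the channel contents, the `trimmed` key, and the variable names are all hypothetical. The assumption is that a closure which changes the meta map in place alters an object other channels may still reference, so the join key can end up different from what the other side of the join holds.

```groovy
workflow {
    // Hypothetical sample channel: [ meta map, reads file name ]
    samples = Channel.of(
        [[id: 'A'], 'A.fastq.gz'],
        [[id: 'B'], 'B.fastq.gz']
    )

    // Risky pattern (left commented out): the closure mutates the map object
    // itself, and that same object may still be referenced by other channels.
    // flagged = samples.map { meta, reads -> meta.trimmed = true; [meta, reads] }

    // Safer pattern: `meta + [...]` builds a new map and leaves the original untouched.
    flagged = samples.map { meta, reads -> [meta + [trimmed: true], reads] }

    // Both sides of the join derive from `flagged`, so their keys are equal maps.
    counts = flagged.map { meta, reads -> [meta, reads.size()] }

    // join matches on the first element (the meta map) by default.
    flagged.join(counts).view()
}
```

Whether this is what happens in the pipeline above is unconfirmed; the sketch only shows the kind of pattern being asked about.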