Joining channels using maps as the key to join by can fail on resume

Open Midnighter opened this issue 1 year ago • 3 comments

Bug report

Expected behavior and actual behavior

When joining two channels on their first element, which is often a map of sample meta information, I expect to be able to resume my pipeline and have all samples processed. Instead, a large part of my samples can be dropped in the join on resume (presumably because the key elements suddenly no longer match).

Steps to reproduce the problem

This is a bit tricky to reproduce since it may require large input. I've written about it in more detail, but the examples there do not reproduce the problem. However, several nf-core pipelines do seem to be affected.

I have a pipeline where this consistently happens on resume, but I cannot share the data (FASTQ pairs) needed to reproduce it.

Program output

The output will look something like the following. Be aware that the outputs of MINIO and FASTQ_READCOUNT are joined and that, before being interrupted, the pipeline had already processed more than 20 samples.

[18/09acfc] process > MINIO (6061)                                  [100%] 78 of 78, cached: 78
[91/22cd44] process > FASTQ_READCOUNT (6061)                        [100%] 78 of 78, cached: 78
[e4/d5f672] process > QUALITY_CONTROL:FASTP (6061)                  [100%] 20 of 20, cached: 20
[97/92efc0] process > QUALITY_CONTROL:FASTQC (6061)                 [100%] 20 of 20, cached: 20

Environment

  • Nextflow version: 22.04.5 build 5708
  • Java version: openjdk version "11.0.16.1" 2022-08-12
  • Operating system: Linux 5.19.5-arch1-1
  • Bash version: 5.1.16

Midnighter avatar Sep 03 '22 14:09 Midnighter

Without further evidence, I'm inclined to think that this is caused by a wrong pattern used in the pipeline. Is the map modified when it is passed around between different processes?

pditommaso avatar Sep 06 '22 06:09 pditommaso

I will try to come up with a minimal example, but what do you mean by a wrong pattern? On a single, complete run, the pipeline finishes as expected.

Midnighter avatar Sep 06 '22 07:09 Midnighter

The problem can arise when the map content is modified across different processes.

pditommaso avatar Sep 06 '22 12:09 pditommaso
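
To illustrate the last comment, here is a minimal sketch in plain Java of why modifying a map that also serves as a lookup key is fragile. It is not Nextflow's join implementation, and nothing in this thread confirms that it is what happens in the reporter's pipeline. It assumes only that a meta map is an ordinary `java.util.Map` (Groovy map literals are `LinkedHashMap` instances). The class name, the method names, and the `id` / `single_end` fields are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class; none of these names come from Nextflow or the pipeline.
class MutableKeyDemo {

    // Builds a sample "meta" map, comparable to [id: '6061'] in Groovy.
    static Map<String, Object> meta(String id) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("id", id);
        return m;
    }

    // Uses the meta map as a key, then modifies that same object in place.
    // A later lookup with an equal-looking fresh key no longer matches.
    static boolean foundAfterInPlaceMutation() {
        Map<String, Object> key = meta("6061");
        Map<Map<String, Object>, String> index = new HashMap<>();
        index.put(key, "reads.fastq.gz");
        key.put("single_end", false); // changes equals() and hashCode() of the stored key
        return index.containsKey(meta("6061"));
    }

    // Copies the meta map before extending it, so the stored key stays intact.
    static boolean foundAfterCopy() {
        Map<String, Object> key = meta("6061");
        Map<Map<String, Object>, String> index = new HashMap<>();
        index.put(key, "reads.fastq.gz");
        Map<String, Object> extended = new LinkedHashMap<>(key); // new object
        extended.put("single_end", false);                       // original untouched
        return index.containsKey(meta("6061"));
    }

    public static void main(String[] args) {
        System.out.println(foundAfterInPlaceMutation()); // prints false
        System.out.println(foundAfterCopy());            // prints true
    }
}
```

In Groovy terms, the in-place variant corresponds to something like `meta.single_end = false`, while the copying variant corresponds to `meta + [single_end: false]`, which returns a new map and leaves the original unchanged. If the same map object is shared between channels, an in-place change in one branch is visible in the other, and whether it lands before or after the join may depend on task timing, which differs between a fresh run and a resumed run with cached tasks.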