
storeDir doesn't reproduce the correct output tuple in the output channel

Open courcelm opened this issue 2 years ago • 2 comments

Bug report

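When storeDir is used with an output tuple whose path is a glob, the channel doesn't produce the expected output: each emitted tuple contains every matching file in the store directory instead of only the file produced by that task.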

Expected behavior and actual behavior

Output without storeDir:

[1, /home/X/work/e6/23dc34e4d6f87b5a6f6a3faa340a97/1.txt]
[2, /home/X/work/00/a3bb70ef6ff1dddde4d0b9122cb0e6/2.txt]

Output with storeDir:

[1, [/home/X/test/1.txt, /home/X/test/2.txt]]
[2, [/home/X/test/1.txt, /home/X/test/2.txt]]

Steps to reproduce the problem

Without storeDir:

input = Channel.of(1,2)

process test {

    input:
    val x from input

    output:
    tuple val(x), path('*.txt') into test_ch

    """
    echo $x > ${x}.txt
    """

}

test_ch.view()

With storeDir:

input = Channel.of(1,2)

process test {

    storeDir "test"

    input:
    val x from input

    output:
    tuple val(x), path('*.txt') into test_ch

    """
    echo $x > ${x}.txt
    """

}

test_ch.view()

Program output

Output without storeDir:

[1, /home/X/work/e6/23dc34e4d6f87b5a6f6a3faa340a97/1.txt]
[2, /home/X/work/00/a3bb70ef6ff1dddde4d0b9122cb0e6/2.txt]

Output with storeDir:

[1, [/home/X/test/1.txt, /home/X/test/2.txt]]
[2, [/home/X/test/1.txt, /home/X/test/2.txt]]

Environment

  • Nextflow version: 21.04.2
  • Java version: openjdk 11.0.13 2021-10-19
  • Operating system: Ubuntu 21.04
  • Bash version: 5.1.4

courcelm avatar Feb 08 '22 15:02 courcelm

This looks like a duplicate of https://github.com/nextflow-io/nextflow/issues/1299. I also encountered the same issue, but reading the docs on storeDir I think it's expected behaviour - the glob in the output declaration '*.txt' causes the process to pick up both stored files. I agree that ideally the behaviour of a process with or without storeDir should be identical, but I'm not sure this is possible since a process won't, in general, know what files are output a priori. This will be problematic for DSL2 since most nf-core modules glob output files.
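To illustrate the point about the glob, here is a minimal, untested sketch of the reproducer from this issue with '*.txt' replaced by an explicit per-task file name. It assumes, as described above, that storeDir resolves the declared output pattern against the shared store directory, so a pattern that can only match the task's own file should leave each tuple with a single path. Everything else is taken unchanged from the original script (DSL1 syntax, as in 21.04.2).

```nextflow
input = Channel.of(1, 2)

process test {

    storeDir "test"

    input:
    val x from input

    output:
    // Explicit name instead of '*.txt': only this task's own file can match,
    // even though both tasks store their files in the same directory.
    tuple val(x), path("${x}.txt") into test_ch

    """
    echo $x > ${x}.txt
    """

}

test_ch.view()
```

This only helps when the output file names can be derived from the inputs; it does not address processes whose output names are not known in advance.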

The workaround that I used was to create unique storeDir directories for each process input using a closure in the process config, e.g.:

process {
    withName: FOO {
        storeDir = {"${params.storedir}/foo/${meta.id}"}
    }
}

I don't see this in the documentation, so I'm not sure it's explicitly permitted, but it works.
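Applied to the reproducer from this issue, the same idea might look like the sketch below. It sets storeDir with a closure directly in the process instead of in the config, using the input value x in place of meta.id. Whether storeDir accepts a closure in this position is an assumption based on the config workaround above and on Nextflow's general dynamic-directive syntax; it has not been verified against 21.04.2.

```nextflow
input = Channel.of(1, 2)

process test {

    // Assumption: storeDir can be given as a closure evaluated per task,
    // so each input value gets its own store directory (test/1, test/2).
    storeDir { "test/${x}" }

    input:
    val x from input

    output:
    // The glob is kept; it should now only see files in this task's directory.
    tuple val(x), path('*.txt') into test_ch

    """
    echo $x > ${x}.txt
    """

}

test_ch.view()
```

The trade-off is one store directory per input value, which keeps the glob in the output declaration but changes the layout of the stored files.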

dancooke avatar Mar 02 '22 15:03 dancooke

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jul 31 '22 17:07 stale[bot]