`transform` construct apparently has a bug when dealing with multiple files

Open zsteve opened this issue 7 years ago • 1 comments

Using `transform` in a bpipe script seems to yield an error when processing paired-end reads `*.R1.fastq.gz` and `*.R2.fastq.gz`. I've been able to replicate this with a simple case (using the latest version of bpipe, 0.9.9.5).

In a directory, I have created two empty files: r1.gz and r2.gz.
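
A minimal sketch of that setup, assuming a POSIX shell. The scratch directory created with mktemp is only for illustration and is not something bpipe requires:

```shell
# Create a throwaway working directory (hypothetical location, any empty directory works).
workdir="$(mktemp -d)"
cd "$workdir"

# Create the two empty input files described in the report.
touch r1.gz r2.gz

# List the directory to confirm both files are present.
ls
```

The pipeline script from the report would then be saved in that same directory before running bpipe against the two files.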

Then in my bpipe script:

change_fnames = {
  transform('.gz', '.gz') to ('.processed', '.processed'){
    exec """echo change_fnames"""
    exec """mv $input1.gz $output1.processed"""
    exec """mv $input2.gz $output2.processed"""
  }
}

run {
  change_fnames
}

Running this via the command `bpipe run pipeline_test.groovy r1.gz r2.gz` results in the following output:

====================================================================================================
|                              Starting Pipeline at 2018-01-19 11:21                               |
====================================================================================================

======================================= Stage change_fnames ========================================
change_fnames
Cleaned up file r1.processed to .bpipe/trash/r1.processed.1
ERROR: stage change_fnames failed: Insufficient inputs: at least 2 inputs are expected with extension .gz but only 1 are available 


========================================= Pipeline Failed ==========================================

Insufficient inputs: at least 2 inputs are expected with extension .gz but only 1 are available

Use 'bpipe errors' to see output from failed commands.

Checking the directory after the run, r1.gz is 'missing' and r2.gz remains untouched.

Bpipe seems like a good solution to the pipeline I need to implement, but at the moment this issue is a bit of a roadblock as it means filenames are going all over the place.

zsteve, Jan 19 '18 00:01

Edit: how funny. I looked at the last issue and figured out what the problem was. It would be good if this could be (conspicuously) added to the docs :)

change_fnames = {
  transform('*.gz') to ('.processed'){
    exec """echo change_fnames"""
    exec """mv $input1.gz $output1.processed"""
    exec """mv $input2.gz $output2.processed"""
  }
}

run {
  change_fnames
}

zsteve, Jan 19 '18 00:01