`transform` construct apparently has a bug when dealing with multiple files
Using `transform` in a bpipe script seems to yield an error when processing paired-end reads named `*.R1.fastq.gz` and `*.R2.fastq.gz`. I've been able to replicate this with a simple case (using the latest version of bpipe, 0.9.9.5).
In a directory, I have created two (empty) files:
r1.gz r2.gz
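For anyone wanting to reproduce this, the setup can be sketched as below. The directory name `bpipe_repro` is just a placeholder I picked for illustration; any empty directory should do.

```shell
# Create a scratch directory (hypothetical name) holding two empty input files
mkdir -p bpipe_repro
touch bpipe_repro/r1.gz bpipe_repro/r2.gz

# Confirm both files exist
ls bpipe_repro
```

Change into that directory before running the pipeline, so the relative file names match the ones used in this report.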
Then in my bpipe script:
change_fnames = {
    transform('.gz', '.gz') to ('.processed', '.processed') {
        exec """echo change_fnames"""
        exec """mv $input1.gz $output1.processed"""
        exec """mv $input2.gz $output2.processed"""
    }
}
run {
    change_fnames
}
Running this via the command
bpipe run pipeline_test.groovy r1.gz r2.gz
Results in the following output:
====================================================================================================
| Starting Pipeline at 2018-01-19 11:21 |
====================================================================================================
======================================= Stage change_fnames ========================================
change_fnames
Cleaned up file r1.processed to .bpipe/trash/r1.processed.1
ERROR: stage change_fnames failed: Insufficient inputs: at least 2 inputs are expected with extension .gz but only 1 are available
========================================= Pipeline Failed ==========================================
Insufficient inputs: at least 2 inputs are expected with extension .gz but only 1 are available
Use 'bpipe errors' to see output from failed commands.
Checking the directory after the run, `r1.gz` is 'missing' and `r2.gz` remains untouched.
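Going by the "Cleaned up file" line in the output above, the renamed file appears to have been moved to bpipe's trash directory rather than deleted, so it should be recoverable. A minimal sketch, assuming the trash path is exactly the one printed in that log line (it may differ on another run):

```shell
# Path copied from the "Cleaned up file" log line; adjust it if your run prints a different one
trashed=".bpipe/trash/r1.processed.1"

# Restore the original input only if the trashed copy exists and r1.gz is really gone
if [ -f "$trashed" ] && [ ! -e r1.gz ]; then
    mv "$trashed" r1.gz
fi
```

The guard keeps the snippet from overwriting an existing `r1.gz` and makes it a no-op when there is nothing in the trash.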
Bpipe seems like a good fit for the pipeline I need to implement, but at the moment this issue is a bit of a roadblock, as it means filenames end up all over the place.
Edit - how funny. I looked at the last issue and figured out what the problem was. It would be good if this could be (conspicuously) added to the docs :)
change_fnames = {
    transform('*.gz') to ('.processed') {
        exec """echo change_fnames"""
        exec """mv $input1.gz $output1.processed"""
        exec """mv $input2.gz $output2.processed"""
    }
}
run {
    change_fnames
}