Task running before a required predecessor task
I am using rc5 and running the GDC's transform.cwl with rabix:
https://github.com/NCI-GDC/gdc-dnaseq-cwl/blob/master/workflows/dnaseq/transform.cwl
$ ./rabix/rabix --basedir ./out gdc-dnaseq-cwl/workflows/dnaseq/transform.cwl gdc-dnaseq-cwl/workflows/dnaseq/NA12878.chrom20.ILLUMINA.bwa.CEU.low_coverage.20121211.json
[2017-05-04 20:34:58.521] [INFO] Job root.samtools_bamtobam has started
[2017-05-04 20:34:58.585] [INFO] Pulling docker image quay.io/ncigdc/samtools:1
[2017-05-04 20:35:00.098] [INFO] Running command line: /usr/local/bin/samtools view -Shb /mnt/SCRATCH/NA12878.chrom20.ILLUMINA.bwa.CEU.low_coverage.20121211.bam > /home/mbookman/out/2f09d953-e255-45ea-8fe1-e1b7fdf5dc0c/root/samtools_bamtobam/NA12878.chrom20.ILLUMINA.bwa.CEU.low_coverage.20121211.bam
[2017-05-04 20:35:45.232] [INFO] Job root.samtools_bamtobam has completed
[2017-05-04 20:35:46.023] [INFO] Job root.picard_validatesamfile_original has started
[2017-05-04 20:35:46.028] [INFO] Job root.biobambam_bamtofastq has started
[2017-05-04 20:35:46.039] [INFO] Job root.bam_readgroup_to_json has started
[2017-05-04 20:35:46.062] [INFO] Pulling docker image quay.io/ncigdc/bam_readgroup_to_json:1
[2017-05-04 20:35:46.075] [INFO] Pulling docker image quay.io/ncigdc/biobambam:1
[2017-05-04 20:35:46.075] [INFO] Pulling docker image quay.io/ncigdc/picard:1
[2017-05-04 20:35:46.085] [INFO] Running command line: /usr/local/bin/bam_readgroup_to_json --bam_path /home/mbookman/out/2f09d953-e255-45ea-8fe1-e1b7fdf5dc0c/root/samtools_bamtobam/NA12878.chrom20.ILLUMINA.bwa.CEU.low_coverage.20121211.bam --mode lenient
[2017-05-04 20:35:46.119] [INFO] Running command line: java -jar /usr/local/bin/picard.jar ValidateSamFile OUTPUT=NA12878.chrom20.ILLUMINA.bwa.CEU.low_coverage.20121211.bam.metrics INPUT=/home/mbookman/out/2f09d953-e255-45ea-8fe1-e1b7fdf5dc0c/root/samtools_bamtobam/NA12878.chrom20.ILLUMINA.bwa.CEU.low_coverage.20121211.bam MAX_OUTPUT=2147483647 MODE=VERBOSE TMP_DIR=. VALIDATION_STRINGENCY=LENIENT
[2017-05-04 20:35:46.145] [INFO] Running command line: /usr/local/bin/bamtofastq T=tempfq collate=1 exclude=QCFAIL,SECONDARY,SUPPLEMENTARY filename=/home/mbookman/out/2f09d953-e255-45ea-8fe1-e1b7fdf5dc0c/root/samtools_bamtobam/NA12878.chrom20.ILLUMINA.bwa.CEU.low_coverage.20121211.bam gz=1 inputformat=bam level=5 outputdir=. outputperreadgroup=1 outputperreadgroupsuffixF=_1.fq.gz outputperreadgroupsuffixF2=_2.fq.gz outputperreadgroupsuffixO=_o1.fq.gz outputperreadgroupsuffixO2=_o2.fq.gz outputperreadgroupsuffixS=_s.fq.gz tryoq=1
[2017-05-04 20:35:48.510] [INFO] Job root.bam_readgroup_to_json has completed
[2017-05-04 20:35:49.017] [INFO] Job root.readgroup_json_db.1 has started
[2017-05-04 20:35:49.039] [INFO] Pulling docker image quay.io/ncigdc/readgroup_json_db:1
[2017-05-04 20:35:49.060] [INFO] Running command line: /usr/local/bin/readgroup_json_db --json_path /home/mbookman/out/2f09d953-e255-45ea-8fe1-e1b7fdf5dc0c/root/bam_readgroup_to_json/SRR622461.json --uuid demo
[2017-05-04 20:35:52.973] [INFO] Job root.readgroup_json_db.1 has completed
[2017-05-04 20:35:54.016] [INFO] Job root.merge_readgroup_json_db has started
[2017-05-04 20:35:54.041] [INFO] Pulling docker image quay.io/ncigdc/merge_sqlite:3
[2017-05-04 20:35:54.063] [INFO] Running command line: /usr/local/bin/merge_sqlite --source_sqlite /home/mbookman/out/2f09d953-e255-45ea-8fe1-e1b7fdf5dc0c/root/readgroup_json_db/1/demo.db --uuid demo
[2017-05-04 20:35:55.927] [INFO] Job root.merge_readgroup_json_db has completed
[2017-05-04 20:36:30.328] [INFO] Job root.picard_validatesamfile_original has completed
[2017-05-04 20:36:31.018] [INFO] Job root.picard_validatesamfile_original_to_sqlite has started
[2017-05-04 20:36:31.023] [INFO] Pulling docker image quay.io/ncigdc/picard_metrics_sqlite:1
[2017-05-04 20:36:31.055] [INFO] Running command line: /usr/local/bin/picard_metrics_sqlite --metric_name ValidateSamFile --bam NA12878.chrom20.ILLUMINA.bwa.CEU.low_coverage.20121211.bam --input_state original --metric_path /home/mbookman/out/2f09d953-e255-45ea-8fe1-e1b7fdf5dc0c/root/picard_validatesamfile_original/NA12878.chrom20.ILLUMINA.bwa.CEU.low_coverage.20121211.bam.metrics --uuid demo
[2017-05-04 20:36:33.830] [INFO] Job root.picard_validatesamfile_original_to_sqlite has completed
[2017-05-04 20:37:02.983] [INFO] Job root.biobambam_bamtofastq has completed
[2017-05-04 20:37:03.128] [INFO] Job root.remove_duplicate_fastq2.1 has started
[2017-05-04 20:37:03.131] [INFO] Job root.remove_duplicate_fastq1.1 has started
[2017-05-04 20:37:03.134] [INFO] Pulling docker image quay.io/ncigdc/fastq_remove_duplicate_qname:1
[2017-05-04 20:37:03.139] [INFO] Job root.remove_duplicate_fastq_s.1 has started
[2017-05-04 20:37:03.142] [INFO] Pulling docker image quay.io/ncigdc/fastq_remove_duplicate_qname:1
[2017-05-04 20:37:03.147] [INFO] Job root.remove_duplicate_fastq_o1.1 has started
[2017-05-04 20:37:03.151] [INFO] Pulling docker image quay.io/ncigdc/fastq_remove_duplicate_qname:1
[2017-05-04 20:37:03.157] [INFO] Running command line: bash -c 'zcat /home/mbookman/out/2f09d953-e255-45ea-8fe1-e1b7fdf5dc0c/root/biobambam_bamtofastq/SRR622461_1.fq.gz | /usr/local/bin/fastq_remove_duplicate_qname - | gzip - > SRR622461_1.fq.gz'
[2017-05-04 20:37:03.170] [INFO] Running command line: bash -c 'zcat /home/mbookman/out/2f09d953-e255-45ea-8fe1-e1b7fdf5dc0c/root/biobambam_bamtofastq/SRR622461_2.fq.gz | /usr/local/bin/fastq_remove_duplicate_qname - | gzip - > SRR622461_2.fq.gz'
[2017-05-04 20:37:03.173] [INFO] Pulling docker image quay.io/ncigdc/fastq_remove_duplicate_qname:1
[2017-05-04 20:37:03.188] [INFO] Running command line: bash -c 'zcat /home/mbookman/out/2f09d953-e255-45ea-8fe1-e1b7fdf5dc0c/root/biobambam_bamtofastq/SRR622461_s.fq.gz | /usr/local/bin/fastq_remove_duplicate_qname - | gzip - > SRR622461_s.fq.gz'
[2017-05-04 20:37:03.202] [INFO] Running command line: bash -c 'zcat /home/mbookman/out/2f09d953-e255-45ea-8fe1-e1b7fdf5dc0c/root/biobambam_bamtofastq/SRR622461_o1.fq.gz | /usr/local/bin/fastq_remove_duplicate_qname - | gzip - > SRR622461_o1.fq.gz'
[2017-05-04 20:37:04.137] [INFO] Job root.sort_scattered_fastq_o2 has started
[2017-05-04 20:37:04.155] [INFO] Job root.sort_scattered_fastq_o2 will be executed without a container. It is an expression tool.
[2017-05-04 20:37:05.666] [INFO] Job root.sort_scattered_fastq_o2 has completed
[2017-05-04 20:37:06.119] [INFO] Job root.decider_bwa_o2 has started
[2017-05-04 20:37:06.128] [INFO] Job root.fastqc_o2.1 has started
[2017-05-04 20:37:06.141] [INFO] Job root.decider_bwa_o2 will be executed without a container. It is an expression tool.
[2017-05-04 20:37:06.158] [INFO] Pulling docker image quay.io/ncigdc/fastqc:1
[2017-05-04 20:37:06.176] [INFO] Running command line: /usr/local/FastQC/fastqc --dir . --format fastq --noextract --outdir . --threads 8
[2017-05-04 20:37:07.705] [INFO] Job root.remove_duplicate_fastq_o1.1 has completed
[2017-05-04 20:37:07.874] [ERROR] Failed to execute status command for ddeb3bb0-46a5-4e49-bb1d-7b2dc80297aa. Could not collect outputs.
org.rabix.executor.ExecutorException: Could not collect outputs.
at org.rabix.executor.handler.impl.JobHandlerImpl.postprocess(JobHandlerImpl.java:352) ~[rabix-backend-local.jar:na]
at org.rabix.executor.execution.command.StatusCommand.run(StatusCommand.java:53) ~[rabix-backend-local.jar:na]
at org.rabix.executor.execution.JobHandlerCommand.run(JobHandlerCommand.java:51) [rabix-backend-local.jar:na]
at org.rabix.executor.execution.JobHandlerRunnable.run(JobHandlerRunnable.java:60) [rabix-backend-local.jar:na]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Caused by: org.rabix.bindings.BindingException: Failed to populate outputs
at org.rabix.bindings.cwl.CWLProcessor.postprocess(CWLProcessor.java:130) ~[rabix-backend-local.jar:na]
at org.rabix.bindings.cwl.CWLBindings.postprocess(CWLBindings.java:83) ~[rabix-backend-local.jar:na]
at org.rabix.executor.handler.impl.JobHandlerImpl.postprocess(JobHandlerImpl.java:330) ~[rabix-backend-local.jar:na]
... 6 common frames omitted
Caused by: org.rabix.bindings.cwl.expression.CWLExpressionException: Failed evaluating expression ${
function include(arr,obj) {
return (arr.indexOf(obj) != -1)
}
function endsWith(str, suffix) {
return str.indexOf(suffix, str.length - suffix.length) !== -1;
}
function local_basename(path) {
var basename = path.split(/[\\/]/).pop();
return basename
}
function local_dirname(path) {
return path.replace(/\\/g,'/').replace(/\/[^\/]*$/, '');
}
function get_slice_number(fastq_name) {
if (endsWith(fastq_name, '_1.fq.gz')) {
return -8
}
else if (endsWith(fastq_name, '_s.fq.gz')) {
return -8
}
else if (endsWith(fastq_name, '_o1.fq.gz')) {
return -9
}
else if (endsWith(fastq_name, '_o2.fq.gz')) {
return -9
}
else {
exit()
}
}
// get predicted readgroup basenames from fastq
var readgroup_basename_array = [];
for (var i = 0; i < inputs.fastq_path.length; i++) {
var fq_path = inputs.fastq_path[i];
var fq_name = local_basename(fq_path.location);
var slice_number = get_slice_number(fq_name);
var readgroup_name = fq_name.slice(0,slice_number) + ".json";
readgroup_basename_array.push(readgroup_name);
}
// find which readgroup items are in predicted basenames
var readgroup_array = [];
for (var i = 0; i < inputs.readgroup_path.length; i++) {
var readgroup_path = inputs.readgroup_path[i];
var readgroup_basename = local_basename(readgroup_path.location);
if (include(readgroup_basename_array, readgroup_basename)) {
readgroup_array.push(readgroup_path);
}
}
var readgroup_sorted = readgroup_array.sort(function(a,b) { return a.location > b.location ? 1 : (a.location < b.location ? -1 : 0) })
return {'output_readgroup_paths': readgroup_sorted}
}
.
at org.rabix.bindings.cwl.expression.javascript.CWLExpressionJavascriptResolver.evaluate(CWLExpressionJavascriptResolver.java:89) ~[rabix-backend-local.jar:na]
at org.rabix.bindings.cwl.CWLProcessor.postprocess(CWLProcessor.java:127) ~[rabix-backend-local.jar:na]
... 8 common frames omitted
<removed repeated stack traces>
[2017-05-04 20:37:07.889] [INFO] Failed to execute status command for ddeb3bb0-46a5-4e49-bb1d-7b2dc80297aa. Could not collect outputs.
[2017-05-04 20:37:07.899] [WARN] Job root.decider_bwa_o2, rootId: 2f09d953-e255-45ea-8fe1-e1b7fdf5dc0c failed: Failed to execute status command for ddeb3bb0-46a5-4e49-bb1d-7b2dc80297aa. Could not collect outputs.
[2017-05-04 20:37:07.930] [WARN] Root job 2f09d953-e255-45ea-8fe1-e1b7fdf5dc0c failed.
[2017-05-04 20:37:08.115] [INFO] Job root.sort_scattered_fastq_o1 has started
[2017-05-04 20:37:08.129] [INFO] Job root.sort_scattered_fastq_o1 will be executed without a container. It is an expression tool.
[2017-05-04 20:37:08.134] [INFO] Failed to execute a Job
$ ls -ltr out/2f09d953-e255-45ea-8fe1-e1b7fdf5dc0c/root/
total 60
drwxr-xr-x 2 mbookman docker 4096 May 4 20:35 samtools_bamtobam
drwxr-xr-x 2 mbookman docker 4096 May 4 20:35 bam_readgroup_to_json
drwxr-xr-x 4 mbookman docker 4096 May 4 20:35 readgroup_json_db
drwxr-xr-x 2 mbookman docker 4096 May 4 20:35 merge_readgroup_json_db
drwxrwxrwx 2 mbookman docker 4096 May 4 20:36 picard_validatesamfile_original
drwxr-xr-x 3 mbookman docker 4096 May 4 20:36 picard_validatesamfile_original_to_sqlite
drwxr-xr-x 2 mbookman docker 4096 May 4 20:37 biobambam_bamtofastq
drwxr-xr-x 4 mbookman docker 4096 May 4 20:37 remove_duplicate_fastq2
drwxr-xr-x 4 mbookman docker 4096 May 4 20:37 remove_duplicate_fastq1
drwxr-xr-x 4 mbookman docker 4096 May 4 20:37 remove_duplicate_fastq_s
drwxr-xr-x 4 mbookman docker 4096 May 4 20:37 remove_duplicate_fastq_o1
drwxr-xr-x 2 mbookman docker 4096 May 4 20:37 sort_scattered_fastq_o2
drwxr-xr-x 2 mbookman docker 4096 May 4 20:37 decider_bwa_o2
drwxr-xr-x 4 mbookman docker 4096 May 4 20:37 fastqc_o2
drwxr-xr-x 2 mbookman docker 4096 May 4 20:37 sort_scattered_fastq_o1
What I observe in the output is that remove_duplicate_fastq_o2 never ran, but sort_scattered_fastq_o2 (which depends on it) did. That appears to be why the executor failed to "collect outputs".
Here are the relevant bits from the CWL:
- id: remove_duplicate_fastq_o2
run: ../../tools/fastq_remove_duplicate_qname.cwl
scatter: INPUT
in:
- id: INPUT
source: biobambam_bamtofastq/output_fastq_o2
out:
- id: OUTPUT
...
- id: sort_scattered_fastq_o2
run: ../../tools/sort_scatter_expression.cwl
in:
- id: INPUT
source: remove_duplicate_fastq_o2/OUTPUT
out:
- id: OUTPUT
cwltool gets well past this step, although it runs each step sequentially; it does not try to build a dependency graph and run steps in parallel. Still, the CWL looks OK to me.
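For reference, the equivalent cwltool invocation is simply the workflow plus the job order file (no special flags needed):

$ cwltool gdc-dnaseq-cwl/workflows/dnaseq/transform.cwl gdc-dnaseq-cwl/workflows/dnaseq/NA12878.chrom20.ILLUMINA.bwa.CEU.low_coverage.20121211.json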
Hi, it seems that remove_duplicate_fastq_o2 received an empty list at its input and was marked completed without ever going to the executor.
In that case the data flowed like this:
[] -> remove_duplicate_fastq_o2 -> [[]] -> sort_scattered_fastq_o2 -> [[]] -> decider_bwa_o2
The executor then failed while evaluating decider_bwa_expression.cwl. Most likely the empty output of remove_duplicate_fastq_o2 is being wrapped inside another array, and decider_bwa_expression only handles empty lists or lists of files, so it throws an exception in this case.
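For illustration, here is a minimal JavaScript sketch (run outside rabix, reusing the local_basename helper from the expression in the trace above) of why that wrapping would blow up decider_bwa_expression. The doubly-wrapped shape is an assumption based on the flow above, and the inputs_expected/inputs_actual names exist only for this example:

function local_basename(path) {
    var basename = path.split(/[\\/]/).pop();
    return basename;
}

// Expected shape when no _o2 fastqs exist: a flat empty list.
// The loop over inputs.fastq_path never executes and the result is [].
var inputs_expected = { fastq_path: [] };

// Shape it apparently received: the empty scatter output wrapped in another array.
var inputs_actual = { fastq_path: [ [] ] };

for (var i = 0; i < inputs_actual.fastq_path.length; i++) {
    var fq_path = inputs_actual.fastq_path[i];      // fq_path is [], not a File object
    var fq_name = local_basename(fq_path.location); // fq_path.location is undefined,
                                                     // so .split throws a TypeError,
                                                     // which surfaces as the CWLExpressionException above
}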
I'll keep investigating and may open a pull request today with a quick fix for the output wrapping.
This is a regression on our part; scattering over empty lists worked properly at some point. I've submitted new conformance tests to cover these cases: common-workflow-language/common-workflow-language#435
Hi, we have just made a new release (v1.0.2); the fix should be included in it.