wgbs-pipeline
wgbs-pipeline copied to clipboard
File latency issue?
Describe the bug
The pipeline creates a file that it doesn't think exists. In the example below, one case of this would be /redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/inputs/667055062/indexes.tar.gz
, which does exist when I check after the fact.
This is running on a compute cluster. I've repeated this multiple times, so it's not just a one-off error.
My first guess it that this is due to file system latency. If this were Snakemake, I'd know that I just needed to increase the value passed to --latency-wait
, but I'm not sure how to set something like that for this pipeline.
OS/Platform
- OS/Platform: Ubuntu 20.04.6 LTS
- Pipeline version: 1.1.7
- Caper version: 2.3.2
Caper configuration file
backend=slurm
# SLURM partition. DEFINE ONLY IF REQUIRED BY YOUR CLUSTER'S POLICY.
# You must define it for Stanford Sherlock.
slurm-partition=serial
# SLURM account. DEFINE ONLY IF REQUIRED BY YOUR CLUSTER'S POLICY.
# You must define it for Stanford SCG.
slurm-account=
# Local directory for localized files and Cromwell's intermediate files.
# If not defined then Caper will make .caper_tmp/ on CWD or `local-out-dir`.
# /tmp is not recommended since Caper store localized data files here.
local-loc-dir=
cromwell=/redacted/caper/2.3.2/jars/cromwell-82.jar
womtool=/redacted/caper/2.3.2/jars/womtool-82.jar
Input JSON file
{
"wgbs.benchmark_mode": true,
"wgbs.extra_reference": "/redacted2/encode-wgbs/wgbs-pipeline/tests/data/conversion_control.fa.gz",
"wgbs.fastqs": [
[
[
"/redacted2/encode-wgbs/wgbs-pipeline/tests/data/sample5_data_1_200000.fastq.gz",
"/redacted2/encode-wgbs/wgbs-pipeline/tests/data/sample5_data_2_200000.fastq.gz"
]
]
],
"wgbs.reference": "/redacted2/encode-wgbs/wgbs-pipeline/tests/data/sacCer3.fa.gz",
"wgbs.sample_names": [
"sample5"
],
"wgbs.underconversion_sequence_name": "NC_001416.1"
}
Error log
2023-10-04 21:10:58,310|caper.cli|INFO| Cromwell stdout: /redacted/wgbs_test/cromwell.out.1
2023-10-04 21:10:58,315|caper.caper_base|INFO| Creating a timestamped temporary directory. /redacted/wgbs_test/.caper_tmp/wgbs-pipeline/20231004_211058_313834
2023-10-04 21:10:58,315|caper.caper_runner|INFO| Localizing files on work_dir. /redacted/wgbs_test/.caper_tmp/wgbs-pipeline/20231004_211058_313834
2023-10-04 21:10:58,686|caper.caper_workflow_opts|INFO| Singularity image found in WDL metadata. wdl=/redacted2/encode-wgbs/1.1.8/wgbs-pipeline.wdl, s=docker://encodedcc/wgbs-pipeline:1.1.7
2023-10-04 21:10:58,706|caper.cromwell|INFO| Validating WDL/inputs/imports with Womtool...
2023-10-04 21:11:04,509|caper.nb_subproc_thread|INFO| Subprocess finished successfully.
2023-10-04 21:11:04,510|caper.cromwell|INFO| Passed Womtool validation.
2023-10-04 21:11:04,510|caper.caper_runner|INFO| launching run: wdl=/redacted2/encode-wgbs/1.1.8/wgbs-pipeline.wdl, inputs=/redacted/wgbs_test/test_wgbs.json, backend_conf=/redacted/wgbs_test/.caper_tmp/wgbs-pipeline/20231004_211058_313834/backend.co
2023-10-04 21:11:15,543|caper.cromwell_workflow_monitor|INFO| Workflow: id=91e49ae4-9226-4824-af45-301fc1a815e8, status=Submitted
2023-10-04 21:11:15,605|caper.cromwell_workflow_monitor|INFO| Workflow: id=91e49ae4-9226-4824-af45-301fc1a815e8, status=Running
2023-10-04 21:11:23,864|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.make_conf:-1, retry=0, status=CallCached
2023-10-04 21:11:26,814|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.make_metadata_csv:-1, retry=0, status=CallCached
2023-10-04 21:11:29,834|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.index_reference:-1, retry=0, status=CallCached
2023-10-04 21:11:35,809|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.prepare:-1, retry=0, status=CallCached
2023-10-04 21:11:44,809|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.map:0, retry=0, status=Started, job_id=2081286
2023-10-04 21:11:44,837|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.map:0, retry=0, status=Running
2023-10-04 21:11:51,943|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.map:0, retry=0, status=Done
2023-10-04 21:11:59,788|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.map:0, retry=1, status=Started, job_id=2081287
2023-10-04 21:11:59,796|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.map:0, retry=1, status=Running
2023-10-04 21:12:04,068|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.map:0, retry=1, status=Done
2023-10-04 21:12:05,042|caper.cromwell_workflow_monitor|INFO| Workflow: id=91e49ae4-9226-4824-af45-301fc1a815e8, status=Failed
2023-10-04 21:12:15,586|caper.cromwell_metadata|INFO| Wrote metadata file. /redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/metadata.json
2023-10-04 21:12:15,587|caper.cromwell|INFO| Workflow failed. Auto-troubleshooting...
2023-10-04 21:12:15,589|caper.nb_subproc_thread|ERROR| Cromwell failed. returncode=1
2023-10-04 21:12:15,589|caper.cli|ERROR| Check stdout in /redacted/wgbs_test/cromwell.out.1
* Started troubleshooting workflow: id=91e49ae4-9226-4824-af45-301fc1a815e8, status=Failed
* Found failures JSON object.
[
{
"causedBy": [
{
"message": "Job wgbs.map:0:2 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details.",
"causedBy": []
}
],
"message": "Workflow failed"
}
]
* Recursively finding failures in calls (tasks)...
==== NAME=wgbs.map, STATUS=RetryableFailure, PARENT=
SHARD_IDX=0, RC=1, JOB_ID=2081286
START=2023-10-04T21:11:41.051Z, END=2023-10-04T21:11:54.791Z
STDOUT=/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/execution/stdout
STDERR=/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/execution/stderr
STDERR_CONTENTS=
tar: /redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/inputs/667055062/indexes.tar.gz: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/execution/mapping/**/*.bam': No such file or directory
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/execution/mapping/**/*.csi': No such file or directory
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/execution/mapping/**/*.bam.md5': No such file or directory
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/execution/mapping/**/*.json': No such file or directory
==== NAME=wgbs.map, STATUS=Failed, PARENT=
SHARD_IDX=0, RC=1, JOB_ID=2081287
START=2023-10-04T21:11:55.035Z, END=2023-10-04T21:12:04.072Z
STDOUT=/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/attempt-2/execution/stdout
STDERR=/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/attempt-2/execution/stderr
STDERR_CONTENTS=
tar: /redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/attempt-2/inputs/667055062/indexes.tar.gz: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/attempt-2/execution/mapping/**/*.bam': No such file or directory
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/attempt-2/execution/mapping/**/*.csi': No such file or directory
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/attempt-2/execution/mapping/**/*.bam.md5': No such file or directory
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/attempt-2/execution/mapping/**/*.json': No such file or directory