Occasional "Failed to read_lines" with IOException
We run custom WDL genomics pipelines, as well as parts of the BioWDL somaticcalling pipeline (https://biowdl.github.io/), using the broadinstitute/cromwell:64 container with a gridengine backend.
Occasionally we see errors like the below:
cromwell_1 | 2023-03-20 14:56:42,590 cromwell-system-akka.dispatchers.engine-dispatcher-193 INFO - WorkflowManagerActor: Workflow 3f4aa8de-d1a9-419a-b9b4-10f9ed0a9d53 failed (during ExecutingWorkflowState): cromwell.backend.standard.StandardAsyncExecutionActor$$anon$2: Failed to evaluate job outputs:
cromwell_1 | Bad output 'scatterList.scatters': Failed to read_lines("/home/devarea/karl/PathoCromwell/cromwell-executions/Agilent_Exome_Single/3f4aa8de-d1a9-419a-b9b4-10f9ed0a9d53/call-SomaticVariants/SomaticVariantcalling/0c6cefa3-4c84-497f-8003-b0d7221bedbc/call-mutect2/Mutect2/fb66e417-4c06-4f15-8607-da8261e16448/call-scatterList/execution/stdout") (reason 1 of 1): [Attempted 1 time(s)] - IOException: Could not read from /home/devarea/karl/PathoCromwell/cromwell-executions/Agilent_Exome_Single/3f4aa8de-d1a9-419a-b9b4-10f9ed0a9d53/call-SomaticVariants/SomaticVariantcalling/0c6cefa3-4c84-497f-8003-b0d7221bedbc/call-mutect2/Mutect2/fb66e417-4c06-4f15-8607-da8261e16448/call-scatterList/execution/stdout: /home/devarea/karl/PathoCromwell/cromwell-executions/Agilent_Exome_Single/3f4aa8de-d1a9-419a-b9b4-10f9ed0a9d53/call-SomaticVariants/SomaticVariantcalling/0c6cefa3-4c84-497f-8003-b0d7221bedbc/call-mutect2/Mutect2/fb66e417-4c06-4f15-8607-da8261e16448/call-scatterList/execution/stdout
cromwell_1 | at cromwell.backend.standard.StandardAsyncExecutionActor.$anonfun$handleExecutionSuccess$1(StandardAsyncExecutionActor.scala:973)
cromwell_1 | at scala.util.Success.$anonfun$map$1(Try.scala:255)
cromwell_1 | at scala.util.Success.map(Try.scala:213)
cromwell_1 | at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
cromwell_1 | at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
cromwell_1 | at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
cromwell_1 | at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
cromwell_1 | at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
cromwell_1 | at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:92)
Basically, Cromwell fails to read a stdout file written by a job. When we check the file, it exists and contains data, so I suspect some kind of IO problem. All data is on NFS shares, which are usually quite stable, and we see no errors in any of the filesystem/NFS backends. It mostly seems to happen in this scatter step, so I suspect a race condition or timeout somewhere in there; however, this job was only scattering over 12 BED files, so it is really not that big. Just re-running the job usually works, but it is extremely annoying.
Call caching is disabled:
call-caching { enabled = false }
Any idea what to do?
The reason for this problem may be that with NFS the writing of the stdout file is asynchronous: after the script has finished, Cromwell (via its Akka actors) detects the rc file while the stdout file may still be being flushed from memory to disk, and that is when this problem occurs.
Workaround: after the script has finished, move the rc file to another directory, sleep for a few seconds (the exact delay should be tuned to the NFS setup), and then move the rc file back to the standard Cromwell location.
I had the same problem. The code that generates stdout and stderr is included in StandardAsyncExecutionActor.scala:
// The `tee` trickery below is to be able to redirect to known filenames for CWL while also streaming
// stdout and stderr for PAPI to periodically upload to cloud storage.
// https://stackoverflow.com/questions/692000/how-do-i-write-stderr-to-a-file-while-using-tee-with-a-pipe
(errorOrDirectoryOutputs, errorOrGlobFiles).mapN((directoryOutputs, globFiles) =>
s"""|#!$jobShell
|DOCKER_OUTPUT_DIR_LINK
|cd ${cwd.pathAsString}
|tmpDir=$temporaryDirectory
|$tmpDirPermissionsAdjustment
|export _JAVA_OPTIONS=-Djava.io.tmpdir="$$tmpDir"
|export TMPDIR="$$tmpDir"
|export HOME="$home"
|(
|cd ${cwd.pathAsString}
|SCRIPT_PREAMBLE
|)
|$out="$${tmpDir}/out.$$$$" $err="$${tmpDir}/err.$$$$"
|mkfifo "$$$out" "$$$err"
|trap 'rm "$$$out" "$$$err"' EXIT
|touch $stdoutRedirection $stderrRedirection
|tee $stdoutRedirection < "$$$out" &
|tee $stderrRedirection < "$$$err" >&2 &
|(
|cd ${cwd.pathAsString}
|ENVIRONMENT_VARIABLES
|INSTANTIATED_COMMAND
|) $stdinRedirection > "$$$out" 2> "$$$err"
|echo $$? > $rcTmpPath
|$emptyDirectoryFillCommand
|(
|cd ${cwd.pathAsString}
|SCRIPT_EPILOGUE
|${globScripts(globFiles)}
|${directoryScripts(directoryOutputs)}
|)
|mv $rcTmpPath $rcPath
|""".stripMargin
.replace("SCRIPT_PREAMBLE", scriptPreamble)
.replace("ENVIRONMENT_VARIABLES", environmentVariables)
.replace("INSTANTIATED_COMMAND", commandString)
.replace("SCRIPT_EPILOGUE", scriptEpilogue)
.replace("DOCKER_OUTPUT_DIR_LINK", dockerOutputDir))
}
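To make the flow of that template easier to follow, here is a simplified, hand-rendered sketch of what the generated script ends up doing on a shared-filesystem backend. This is not the real script: the directories and the task command are made up, and the real script also handles Docker output links, globs and directory outputs.
#!/bin/bash
cd "$(mktemp -d)"                 # stands in for the call's execution directory
tmpDir="$(mktemp -d)"             # stands in for $temporaryDirectory
out="${tmpDir}/out.$$"; err="${tmpDir}/err.$$"
mkfifo "$out" "$err"
trap 'rm "$out" "$err"' EXIT
touch stdout stderr
tee stdout < "$out" &             # background tee draining the stdout FIFO
tee stderr < "$err" >&2 &         # background tee draining the stderr FIFO
(
echo "task output"                # stands in for INSTANTIATED_COMMAND
) > "$out" 2> "$err"
echo $? > rc.tmp
sync                              # the default SCRIPT_EPILOGUE
mv rc.tmp rc                      # Cromwell polls for this rc file to decide the job is done
Note that nothing waits on the background tee processes before the rc file is moved into place.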
SCRIPT_EPILOGUE defaults to sync and can be changed with the script-epilogue variable in the configuration (this is not explained in the Cromwell documentation, but it is explained here). Maybe the problem could have been solved by also replacing $stdoutRedirection and $stderrRedirection with something like $stdoutRedirectionTmp and $stderrRedirectionTmp, and then replacing:
mv $rcTmpPath $rcPath
with:
mv $stdoutRedirectionTmp $stdoutRedirection
mv $stderrRedirectionTmp $stderrRedirection
mv $rcTmpPath $rcPath
This way stdout and stderr would have been created in the NFS filesystem at the same time as the rc file, which would increase the likelihood that they would all have been synced at the same time. However, this would not give the intended behavior when running in Google Cloud.
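Concretely, with that change the tail of the generated script would look roughly like this (a hand-written sketch; stdout.tmp and stderr.tmp are illustrative stand-ins for whatever $stdoutRedirectionTmp and $stderrRedirectionTmp would expand to):
tee stdout.tmp < "$out" &
tee stderr.tmp < "$err" >&2 &
(
echo "task output"                # stands in for INSTANTIATED_COMMAND
) > "$out" 2> "$err"
echo $? > rc.tmp
sync                              # SCRIPT_EPILOGUE
mv stdout.tmp stdout              # stdout/stderr appear alongside rc
mv stderr.tmp stderr
mv rc.tmp rc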
Another problem that I have noticed is that there are multiple places in the Cromwell documentation that advise, when running Cromwell with SLURM, to use configurations such as (see here):
sbatch \
-o ${cwd}/execution/stdout \
-e ${cwd}/execution/stderr \
or as (see here, or here, or here, or here)
sbatch \
-o ${out} \
-e ${err} \
which overwrites stdout and stderr written by the script file, which seems inappropriate. The following should have been used instead:
sbatch \
-o ${out}.sbatch \
-e ${err}.sbatch \
Similarly to how it is advised for SGE, where ${out}.qsub and ${err}.qsub are used in place of ${out} and ${err}.
The current workaround suggested by @honestAnt is instead to use something like this in the Cromwell configuration file:
submit-docker = """
...
sbatch \
--wait \
-J ${job_name} \
-D ${cwd} \
-o ${out}.sbatch \
-e ${err}.sbatch \
-t ${runtime_minutes} \
-c ${cpu} \
--mem=${memory_mb} \
--wrap "singularity exec --containall --bind ${cwd}:${docker_cwd} docker://${docker} ${job_shell} ${docker_script}"
mv ${cwd}/execution/rc ${cwd}/execution/rc.tmp
sleep 60
mv ${cwd}/execution/rc.tmp ${cwd}/execution/rc
"""
A better alternative would be to use something like this in the Cromwell configuration file (as suggested here):
script-epilogue = "sleep 60 && sync"
submit-docker = """
...
sbatch \
--wait \
-J ${job_name} \
-D ${cwd} \
-o ${out}.sbatch \
-e ${err}.sbatch \
-t ${runtime_minutes} \
-c ${cpu} \
--mem=${memory_mb} \
--wrap "singularity exec --containall --bind ${cwd}:${docker_cwd} docker://${docker} ${job_shell} ${docker_script}"
"""
But both options are far from ideal, and the choice of 60 seconds might not be sufficient for all NFS configurations. If anybody wants to try to trigger this issue on an NFS shared filesystem setup, the following WDL should do the trick:
version 1.0

workflow main {
  scatter (idx in range(256)) {
    call main {
      input:
        i = idx
    }
  }
  output { Array[Int] n = main.n }
}

task main {
  input {
    Int i
  }
  command <<<
    set -euo pipefail
    echo ~{i*i}
  >>>
  output {
    Int n = read_int(stdout())
  }
  runtime {
    docker: "debian:stable-slim"
  }
}
After doing some testing with a collaborator who is able to reproduce the problem, I was able to gather that:
- script-epilogue = "sleep 5 && sync" worked
- script-epilogue = "ls -l stdout stderr && sync" worked
- script-epilogue = "ls && sync" failed
with the ls -l suggestion coming from here. I am guessing that tee might not be writing to stdout/stderr fast enough, and by the time the sync command (which is what script-epilogue defaults to) is run, the stdout/stderr files might still be incomplete. So if I understand this correctly, while this is related to an NFS synchronization problem, it is not strictly an NFS synchronization problem but rather a synchronization problem between tee and sync.
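For what it is worth, the race can be sketched without NFS at all. The following is a purely hypothetical repro script, not Cromwell's code, and all file names are made up; it mirrors the pattern above in that the parent shell reaches sync and the rc move without waiting for the background tee:
#!/bin/bash
cd "$(mktemp -d)"
mkfifo out.fifo
tee stdout < out.fifo > /dev/null &   # background tee, never waited on (its own copy discarded to keep the demo quiet)
seq 1 1000000 > out.fifo              # the "task" writes through the FIFO and exits
echo 0 > rc.tmp
sync                                  # flushes kernel buffers to disk, but does not wait for tee
mv rc.tmp rc                          # rc is now visible...
wc -l stdout                          # ...while stdout may occasionally still be missing lines here
wait                                  # let tee finish before the script exits
rm out.fifo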
tee buffers the program's standard output; after sync is run, that data is flushed from the buffer to disk, and after that NFS synchronizes it to the remote server (NFS itself has a delay of a few seconds). So the problem can come from tee and sync, or from NFS. Without changing the source code, the best option is to delay the rc.tmp to rc move by a few seconds.