miniwdl
Inconsistent behavior with optional file types when coercing from string
When a File is declared optional (File?), its value should be null if the file does not exist. In an array of optional files (Array[File?]), each entry whose file does not exist should therefore become null.
When coercing from string to file, this check for file existence seems to be inconsistent depending on how the WDL is written:
version 1.1
workflow testWorkflow {
    input {
    }
    call testTask
    output {
        Array[File?] array_in_output = testTask.array_in_output
        Int len_in_output = testTask.len_in_output
        Array[File?] array_in_body_out = testTask.array_in_body_out
        Int len_in_body_out = testTask.len_in_body_out
        Array[File?] array_in_input_out = testTask.array_in_input_out
        Int len_in_input_out = testTask.len_in_input_out
    }
}
task testTask {
    input {
        Array[File?] array_in_input = ["example1.txt", "example2.txt"]
        Int len_in_input = length(select_all(array_in_input))
    }
    command <<<>>>
    Array[File?] array_in_body = ["example1.txt", "example2.txt"]
    Int len_in_body = length(select_all(array_in_body))
    output {
        Array[File?] array_in_output = ["example1.txt", "example2.txt"]
        Int len_in_output = length(select_all(array_in_output))
        Array[File?] array_in_body_out = array_in_body
        Int len_in_body_out = len_in_body
        Array[File?] array_in_input_out = array_in_input
        Int len_in_input_out = len_in_input
    }
}
With MINIWDL__FILE_IO__ALLOW_ANY_INPUT=True miniwdl run test.wdl, this returns:
{
  "dir": "/home/heaucques/Documents/wdl-conformance-tests/20240626_184902_testWorkflow",
  "outputs": {
    "testWorkflow.array_in_body_out": [
      null,
      null
    ],
    "testWorkflow.array_in_input_out": [
      null,
      null
    ],
    "testWorkflow.array_in_output": [
      null,
      null
    ],
    "testWorkflow.len_in_body_out": 2,
    "testWorkflow.len_in_input_out": 2,
    "testWorkflow.len_in_output": 0
  }
}
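All of len_in_body_out, len_in_input_out, and len_in_output should be 0. However, when the array is declared in the task input section or the task body, select_all appears to run on the string representation rather than the file representation, so neither entry counts as null and the length is 2. Only when the array is declared in the output section are the file paths checked for existence. (The arrays themselves are all reported as [null, null], presumably because each one passes through the task's output section, where the check is applied.)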
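To make the expected semantics concrete, here is a minimal Python sketch. It is only a model of the behavior this report argues for, not miniwdl's implementation; the helper names coerce_optional_file, expected_length, and observed_length are hypothetical, and the assumption that a missing path coerces to null is the one stated at the top of this issue.

```python
import os
import tempfile
from typing import List, Optional


def coerce_optional_file(path: str, base_dir: str) -> Optional[str]:
    """Hypothetical String -> File? coercion: a missing file becomes None."""
    full_path = os.path.join(base_dir, path)
    return full_path if os.path.isfile(full_path) else None


def select_all(values: List[Optional[str]]) -> List[str]:
    """Model of WDL select_all: drop null entries, preserve order."""
    return [v for v in values if v is not None]


def expected_length(paths: List[str], base_dir: str) -> int:
    """Coerce first, then count: what the report expects everywhere."""
    coerced = [coerce_optional_file(p, base_dir) for p in paths]
    return len(select_all(coerced))


def observed_length(paths: List[str]) -> int:
    """Model of the reported input/body behavior: select_all sees raw
    strings, none of which is null, so nothing is filtered out."""
    return len(select_all(list(paths)))


if __name__ == "__main__":
    names = ["example1.txt", "example2.txt"]
    # An empty temporary directory stands in for a working directory
    # where neither file exists.
    with tempfile.TemporaryDirectory() as empty_dir:
        print("expected:", expected_length(names, empty_dir))
        print("observed in input/body:", observed_length(names))
```

In this model the only difference between the two results is whether the existence check happens before select_all is evaluated, which matches the discrepancy between len_in_output and the other two lengths.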
String-to-File coercion also behaves differently depending on whether it happens in a task or in a workflow. The WDL above does most of its work in a task; the same declarations placed in a workflow's output section fail instead of producing null values:
version 1.1
workflow testWorkflow {
    input {
    }
    output {
        Array[File?] array_in_output = ["example1.txt", "example2.txt"]
        Int len_in_output = length(select_all(array_in_output))
    }
}
2024-06-26 18:53:32.953 wdl.w:testWorkflow workflow start :: name: "testWorkflow", source: "test.wdl", line: 2, column: 1, dir: "/home/heaucques/Documents/wdl-conformance-tests/20240626_185332_testWorkflow"
2024-06-26 18:53:32.954 wdl.w:testWorkflow miniwdl :: version: "v1.12.0", uname: "Linux pop-os 6.8.0-76060800daily20240311-generic #202403110203~1715181801~22.04~aba43ee SMP PREEMPT_DYNAMIC Wed M x86_64"
2024-06-26 18:53:32.954 wdl.w:testWorkflow task thread pool initialized :: task_concurrency: 8
2024-06-26 18:53:32.968 wdl.w:testWorkflow visit :: node: "output-array_in_output", values: {"array_in_output": ["example1.txt", "example2.txt"]}
2024-06-26 18:53:32.968 wdl.w:testWorkflow visit :: node: "output-len_in_output", values: {"len_in_output": 2}
2024-06-26 18:53:32.995 wdl.w:testWorkflow workflow testWorkflow (test.wdl Ln 2 Col 1) failed :: dir: "/home/heaucques/Documents/wdl-conformance-tests/20240626_185332_testWorkflow", error: "InputError", message: "workflow output uses nonexistent file/directory: example1.txt", node: "outputs"
2024-06-26 18:53:32.995 wdl.w:testWorkflow aborting workflow
2024-06-26 18:53:32.995 miniwdl-run workflow output uses nonexistent file/directory: example1.txt :: error: "InputError", node: "outputs", dir: "/home/heaucques/Documents/wdl-conformance-tests/20240626_185332_testWorkflow"
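The two runs can be read as two different policies for a missing path in an Array[File?] output: the task output section turns it into null, while the workflow output section raises an error. The sketch below models that contrast in Python. It is an illustration only and does not reflect miniwdl's internals; the function resolve_output_files and its on_missing parameter are hypothetical names introduced here.

```python
import os
import tempfile
from typing import List, Optional


def resolve_output_files(
    paths: List[str], base_dir: str, on_missing: str = "null"
) -> List[Optional[str]]:
    """Resolve output paths under one of two hypothetical policies.

    on_missing="null":  a missing file becomes None (as seen in task outputs).
    on_missing="error": a missing file raises (as seen in workflow outputs).
    """
    if on_missing not in ("null", "error"):
        raise ValueError(f"unknown policy: {on_missing}")
    resolved: List[Optional[str]] = []
    for path in paths:
        full_path = os.path.join(base_dir, path)
        if os.path.exists(full_path):
            resolved.append(full_path)
        elif on_missing == "null":
            resolved.append(None)
        else:
            raise FileNotFoundError(
                f"output uses nonexistent file/directory: {path}"
            )
    return resolved


if __name__ == "__main__":
    names = ["example1.txt", "example2.txt"]
    with tempfile.TemporaryDirectory() as empty_dir:
        # Task-like policy: every missing entry is nulled out.
        print(resolve_output_files(names, empty_dir, on_missing="null"))
        # Workflow-like policy: the first missing entry aborts the run.
        try:
            resolve_output_files(names, empty_dir, on_missing="error")
        except FileNotFoundError as exc:
            print("failed:", exc)
```

Consistent behavior would mean applying the same policy in both places. For optional file types, the report argues for the null policy.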