Disallow relative path literals in declaration expressions except in output section
This is an abridged version of the same issue from here
A computation that depends on nullable/optional files can give different results depending on where in the task/workflow it is placed:
version 1.1
task t {
  input {}
  command <<<>>>
  Array[File?] file_arr = ["example1.txt"]
  output {
    Int len = length(select_all(file_arr))
  }
}
version 1.1
task t {
  input {}
  command <<<>>>
  output {
    Array[File?] file_arr = ["example1.txt"]
    Int len = length(select_all(file_arr))
  }
}
According to the spec: https://github.com/openwdl/wdl/blob/caff59db192636d9f93f3f5659eb5939f51ff877/SPEC.md?plain=1#L3880
All file outputs are required to exist, otherwise the task will fail... the value will be undefined if the file does not exist
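Since values are only allowed to be undefined in the task output section, the two tasks disagree when example1.txt does not exist: the first task will output len = 1 (the private declaration keeps the path as a defined value), while the second task will output len = 0 (the missing file becomes undefined and is dropped by select_all).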
I see the issue now. Yes it's true that defining a File variable in the output section has different semantics vs elsewhere.
I would be in favor of disallowing relative path literals in declaration expressions, except in the output section. Any use case I can think of for defining an input or private variable with a relative path would be satisfied by using a String instead. Then all path literals could be checked for existence and, if they don't exist, set to None if the type is optional or raise an error otherwise. But this may be a breaking change, so we'd likely deprecate it in 1.3 and remove it in 2.0.
For now we can make this distinction more clear in the spec.
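As a rough illustration of the String-based alternative suggested above, the first task could be rewritten along these lines. This is only a sketch: the name `file_names` is hypothetical, and it assumes the engine accepts the `Array[String]` to `Array[File?]` coercion in the output section and applies the output-section existence rule to the coerced paths.

```wdl
version 1.1

task t {
  input {}

  # Hypothetical: hold the relative path as a String, so no File
  # existence semantics are involved outside the output section.
  Array[String] file_names = ["example1.txt"]

  command <<<>>>

  output {
    # The String-to-File? coercion happens only here, where an optional
    # file that does not exist is allowed to become undefined (assumption:
    # the engine treats coerced paths the same as path literals).
    Array[File?] file_arr = file_names
    Int len = length(select_all(file_arr))
  }
}
```

Written this way, the file-existence check only ever happens in the output section, so the result should no longer depend on where the path was first declared.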
This behavior was clarified in WDL v1.2 as explained in the description of #735. That being said, the changes never made it back to WDL v1.1 as best I can tell. I'll work to get the final behavior settled in #735 and then backport it to WDL v1.1/WDL v1.2 (that PR is proposed changes to WDL v1.3).