map contains_key() function
This is related to issue #174 and is a solution suggested in that thread. Being able to check whether a map has a certain key would allow terser, more maintainable WDLs. In my use case the map would USUALLY be completely empty, but it has a very large number of possible keys.
input {
  Map[String, MyOverrideStruct] overrides_map
}
MyOverrideStruct? runtime_override = if contains_key(overrides_map, "MyTask") then overrides_map["MyTask"] else None
call MyTask {
  input:
    foo = "foo",
    runtime_override = runtime_override
}
The use case is that the SV team is building a big pipeline that calls a lot of complex 3rd-party code. It is inevitable that a few difficult samples will cause some tasks to crash, e.g. with out-of-memory errors. To push these difficult samples through the pipeline, we pass structs with optional overrides to each task, allowing us to bump up memory, increase disk size, etc. However, this creates a scaling problem: the overrides for EVERY relevant task must be passed through sub-workflows, so our WDLs end up looking mostly like a long list of optional overrides. The present alternative would be to create one enormous struct of overrides, which could at least hide some of the mess but still presents an annoying maintenance problem: if someone renames, adds, or removes tasks, the struct file needs to be changed too.
Draft implementation: https://github.com/openwdl/wdl/tree/305-contains-key
@TedBrookings This has come up a few times in the past. If it is something you'd like to see, I'd encourage a PR against the spec.
I had the same issue when writing my WDL and, while I also think a contains_key() function would be a great addition to WDL, the following code obviates the issue for now:
scatter (key in keys(overrides_map)) { Boolean? tmp_arr = if key =="MyTask" then true else None }
MyOverrideStruct? runtime_override = if length(select_all(tmp_arr))>0 then overrides_map["MyTask"] else None
Notice though that keys() is not present in version 1.0, and Cromwell does not support version 1.1 at the time of writing this comment. However, Cromwell partially supports version development, and it does support the function keys(). I have used this hack in many of my WDLs. Do notice though that if you have to use this within a scatter loop, you will be running a scatter within a scatter, which forces Cromwell to spawn sub-workflows, a situation I always try to avoid.
There is actually a better way to imitate an is_in(X, Array[X]) function without having to use a scatter:
version development
workflow main {
  input {
    Array[String] array = ["a", "b", "c"]
    String element = "d"
  }
  output {
    Boolean is_in = length(collect_by_key(zip(flatten([array, [element]]), range(length(array) + 1)))[element]) > 1
  }
}
To imitate a map_contains_key(X, Map[X,Y]) function, you can use the function keys() to extract an Array from a Map.
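Combining the two ideas, a minimal sketch could look like the following. It assumes an engine that supports keys() and collect_by_key() (version 1.1 or development); the names overrides_map, key, map_keys and has_key are only illustrative.

```wdl
version development

workflow main {
  input {
    # Illustrative inputs; any Map with String keys works the same way
    Map[String, Int] overrides_map = {"MyTask": 1}
    String key = "MyTask"
  }

  # Extract the keys of the map as an Array[String]
  Array[String] map_keys = keys(overrides_map)

  output {
    # Append the key being looked up to the list of keys, pair every entry
    # with its index, and group the indices by entry. The looked-up key then
    # maps to more than one index only if it was already a key of the map.
    Boolean has_key = length(collect_by_key(zip(flatten([map_keys, [key]]), range(length(map_keys) + 1)))[key]) > 1
  }
}
```

Appending the key before grouping guarantees that the lookup `[key]` always succeeds, so the expression does not fail when the key is absent from the map.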
Proposed signatures:
Boolean contains_key(Map[P, Y], P)
Boolean contains_key(Map[P?, Y], (P | None))
As mentioned in #596, I'd like to consider adding the following signature:
Boolean contains_key(Object, String)
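As a rough illustration of the first signature, the original use case might be written as below. This is only a sketch: it assumes contains_key() is available in the WDL version being used, and the members of MyOverrideStruct (memory_gb, disk_gb) are hypothetical, since the thread does not define them.

```wdl
version development

# Hypothetical struct; the members are placeholders for illustration only
struct MyOverrideStruct {
  Int? memory_gb
  Int? disk_gb
}

workflow main {
  input {
    # Usually empty; only difficult samples supply overrides
    Map[String, MyOverrideStruct] overrides_map = {}
  }

  # Assumes the proposed Boolean contains_key(Map[P, Y], P) signature.
  # The lookup is only evaluated when the key is known to be present.
  MyOverrideStruct? runtime_override =
    if contains_key(overrides_map, "MyTask") then overrides_map["MyTask"] else None

  output {
    Boolean has_override = defined(runtime_override)
  }
}
```

With the default empty map, runtime_override stays undefined, so a task receiving it would fall back to its own defaults.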