map contains_key() function

Open TedBrookings opened this issue 5 years ago • 5 comments

This is related to issue #174 and is a solution suggested in that thread. Being able to check whether a map has a certain key would allow terser, more maintainable WDLs. In my use case the map would USUALLY be totally empty, but it has a very large number of possible keys.

input {
  Map[String, MyOverrideStruct] overrides_map
}
MyOverrideStruct? runtime_override = if contains_key(overrides_map, "MyTask") then overrides_map["MyTask"] else None
call MyTask {
  input:
    foo = "foo",
    runtime_override = runtime_override
}

The use case is that the SV team is building a big pipeline that calls a lot of complex 3rd-party code. It is inevitable that a few difficult samples will cause some tasks to crash, e.g. with out-of-memory errors. To push these difficult samples through the pipeline, we pass structs with optional overrides to the task, allowing us to bump up memory, increase disk size, etc. However, this introduces a scaling problem: the overrides for EVERY relevant task must be passed through sub-workflows, making our WDLs look mainly like a long list of optional overrides. The present alternative would be to create an enormous struct of overrides, which could at least hide some of the mess but still presents an annoying maintenance problem: if someone renames, adds, or removes tasks, then the struct file needs to be changed too.

Draft implementation: https://github.com/openwdl/wdl/tree/305-contains-key

TedBrookings, Apr 04 '19 14:04

@TedBrookings This has come up a few times in the past. If it is something you'd like to see I'd encourage a PR against the spec

geoffjentry, Apr 07 '19 16:04

I had the same issue when writing my WDL and, while I also think a contains_key() function would be a great addition to WDL, the following code works around the issue for now:

scatter (key in keys(overrides_map)) {
  Boolean? tmp_arr = if key == "MyTask" then true else None
}
MyOverrideStruct? runtime_override = if length(select_all(tmp_arr)) > 0 then overrides_map["MyTask"] else None

Note, though, that keys() is not present in version 1.0, and Cromwell does not support version 1.1 at the time of writing. However, Cromwell partially supports version development, which does include keys(). I have used this hack in many of my WDLs. Also note that if you need it inside a scatter block, you end up with a scatter within a scatter, which forces Cromwell to spawn sub-workflows, a situation I always try to avoid.

freeseek, Sep 05 '22 19:09

There is actually a better way to imitate an is_in(X, Array[X]) function without having to use a scatter:

version development
workflow main {
  input {
    Array[String] array = ["a", "b", "c"]
    String element = "d"
  }
  output {
    Boolean is_in = length(collect_by_key(zip(flatten([array,[element]]),range(length(array)+1)))[element])>1
  }
}

To imitate a map_contains_key(X, Map[X,Y]) function, you can use the function keys() to extract an Array from a Map.
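
A minimal sketch of that combination, reusing the same trick on the array returned by keys(). The map contents, the Int value type, and the names map_keys, positions, has_key and value are placeholders chosen for illustration, and it assumes an engine that accepts keys(), collect_by_key() and None under version development:

```wdl
version development

workflow main {
  input {
    # Hypothetical example inputs; Int stands in for any value type.
    Map[String, Int] overrides_map = {"MyTask": 16, "OtherTask": 32}
    String key = "MyTask"
  }

  # keys() turns the map into an Array[String] of its keys.
  Array[String] map_keys = keys(overrides_map)

  # Append the probe key once, pair every entry with its index, and group the
  # indices by entry. The probe key collects more than one index only if it
  # was already among the map's keys.
  Map[String, Array[Int]] positions = collect_by_key(
    zip(flatten([map_keys, [key]]), range(length(map_keys) + 1))
  )

  output {
    Boolean has_key = length(positions[key]) > 1
    # Look the value up only when the key is known to be present.
    Int? value = if has_key then overrides_map[key] else None
  }
}
```

Because no scatter is involved, this form can be used inside a scatter block without nesting a second one.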

freeseek, Feb 10 '23 16:02

Proposed signatures:

Boolean contains_key(Map[P, Y], P)
Boolean contains_key(Map[P?, Y], (P | None))
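
As a rough illustration of how the first signature might be used, here is a sketch that assumes an engine implementing contains_key() as proposed. The version line, the Int value type, the default of 4, and the names memory_gb_override and memory_gb are placeholders, not part of the proposal:

```wdl
version development

workflow main {
  input {
    # Usually empty; keyed by task name when an override is needed.
    Map[String, Int] overrides_map = {}
  }

  # contains_key() here is the proposed function, not an existing built-in
  # in the versions discussed above.
  Int? memory_gb_override = if contains_key(overrides_map, "MyTask")
                            then overrides_map["MyTask"]
                            else None

  output {
    # Fall back to a hypothetical default when no override was supplied.
    Int memory_gb = select_first([memory_gb_override, 4])
  }
}
```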

jdidion, Mar 23 '23 14:03

As mentioned in #596, I'd like to consider adding the following signature:

Boolean contains_key(Object, String)

jdidion, Dec 12 '23 22:12