engine function to map over an Array
There's currently one main function (prefix) that maps over all the elements in an array. When one wants to do some other kind of mapping/manipulation over all the elements in an array, they can do so by using a scatter block:
Array[String] arr = ['a', 'b', 'c']
scatter (i in arr) {
String append_suffix = i + '_1'
}
Array[String] arr_with_suffix = append_suffix # ['a_1', 'b_1', 'c_1']
It would be great if there was an engine function that allowed for running functions over an Array type without requiring the boilerplate of a scatter block.
Just a thought: what would you think of a list comprehension expression, e.g. something like the Python syntax? (Just throwing out some ideas here.)
# 1:
Array[String] arr_with_suffix = [ a in arr: a + "_1" ]
# 2:
Array[String] arr_with_suffix = [ arr: a => a + "_1" ]
# 3:
Array[String] arr_with_suffix = [ map arr: a => a + "_1" ]
# 4:
Array[String] arr_with_suffix = [ a + "_1" for a in arr ]
A list comprehension sounds like exactly what's required, and I could work with any of that syntax.
Do you imagine something like this would be allowed?
Array[String] arr_with_suffix = [ a + "_1" for a in arr if (true) ]
@ruchim if we go down this route, I don't see why not!
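For anyone less familiar with the Python syntax that option 4 borrows, here is a minimal sketch of what the plain and filtered forms do in Python itself. The filter condition (`a != "b"`) is made up purely for illustration; nothing here is proposed WDL syntax.

```python
arr = ["a", "b", "c"]

# Same result as the scatter block: apply the expression to every element.
arr_with_suffix = [a + "_1" for a in arr]

# Filtered variant: only elements that pass the condition are kept.
# The condition (a != "b") is an illustrative assumption.
arr_filtered = [a + "_1" for a in arr if a != "b"]

print(arr_with_suffix)  # ['a_1', 'b_1', 'c_1']
print(arr_filtered)     # ['a_1', 'c_1']
```

Note that a filter changes the length of the result, so the output would no longer line up index-for-index with the input array.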
Another possible suggestion is to follow the path of vectorized languages like R or MATLAB (which in my experience are the two most common languages used by bioinformatics researchers), and simply have functions be Array-aware. That is, if a scalar function that takes scalar input is run on an array, a new array is returned with each value being the result of that scalar function being run on the corresponding value of the input array (implicit map).
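To make the implicit-map idea concrete, here is a rough sketch of the semantics in Python. The `array_aware` wrapper is a hypothetical helper invented for this illustration, not an existing WDL or engine function, and the file paths are made up.

```python
import posixpath
from functools import wraps


def array_aware(scalar_fn):
    """Hypothetical helper: lift a one-argument scalar function so it also accepts a list."""
    @wraps(scalar_fn)
    def wrapper(value):
        if isinstance(value, list):
            # Apply element-wise, preserving order; recursion covers nested arrays.
            return [wrapper(v) for v in value]
        return scalar_fn(value)
    return wrapper


# Stand-in for an engine function such as basename.
basename = array_aware(posixpath.basename)

single = basename("/data/s1.bam")
many = basename(["/data/s1.bam", "/data/s2.bam"])

print(single)  # s1.bam
print(many)    # ['s1.bam', 's2.bam']
```

The scalar call behaves exactly as before, and the array call returns a new array of the same length and order, which is the behaviour described above.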
I raised an issue which ended up being a duplicate, but wanted to quote @jtratner's answer from here:
scatter (x in arr) {
String names = basename(x)
}
call mytask { input: names=names }
and then names outside of scatter are an array of strings.
Super convenient for an array of maps of files with their indexes, i.e. an array of BAMs and their BAIs.
Hey @patmagee, is there any way to gain some momentum on this feature request? Is list comprehension an acceptable approach? I'm happy to target something against a spec and implement it in MiniWDL maybe?
I like the addition but I'd opt for the less burdensome (on the user) approach suggested by @dheiman where if you pass an array of strings to something like basename or a +, you'd get the same array back, in the same order, but with the operation performed. That way folks who aren't python users don't need to learn what list comprehension is.