wdl icon indicating copy to clipboard operation
wdl copied to clipboard

Clarifying "Absence of presence" and "presence of absence" (optional -> non optional task input with default)

Open illusional opened this issue 3 years ago • 13 comments

Essentially, if I have some workflow and task that connections an optional value to a non-optional value. This should be a validation failure right?

workflow SmallWorkflow {
  input {
    String? myString
  }
  call Print {
    input:
      inp=myString
  }
}

task Print {
  input {
    String inp = "default"
  }
  command <<<
    echo '{inp}'
  >>>
  output {
    String out = read_string(stdout())
  }
}

How about if you slightly change the workflow to have:

task Print {
  input {
    String? inp = "default"
  }
  #...
}

My interpretation is:

  • providing a value that is null => value becomes null
  • not providing a value (ie, it's not present in the input: map => default is applied

But I don't think the spec clarifies this, and looks like there is ambiguity in Cromwell and MiniWDL.

Edit: I believe the way to apply a default in a task is to do:

task Print {
  input {
    String? inp
    String _inp = select_first([inp, "default"])
  }
  #...
}

illusional avatar May 12 '21 07:05 illusional

So this is an interesting edgecase which the Section on Optionals and Defaults does not fully get into. Just to be clear, this is working fine despite both of us thinking it should technically be a validation error correct?

If that statement is true, then what I think is happening is that since we are defining a default for the task input, when we plumb down the optional the engine will either use the optional value, or the default. this feels... unintuitive to me?

patmagee avatar May 12 '21 13:05 patmagee

I don't think this should be a validation error. When the workflow input doesn't have a value then no value is given to the call's input and, thus, the default gets used.

DavyCats avatar May 12 '21 14:05 DavyCats

I haven't checked this exact example, but the GATK-sv pipeline has this, it validates in Cromwell but fails in MiniWDL.

IMO it should fail to validate, WDL has a mechanism for applying defaults, and supplying a null to an input with a default should be possible.

illusional avatar May 12 '21 14:05 illusional

supplying a null to an input with a default should be possible.

~~Isn't this why the None literal was added in 1.1?~~

Nevermind, I think you're correct. Optional inputs would have a None value by default, so those would also pass the None value along to the calls.

DavyCats avatar May 12 '21 14:05 DavyCats

The spec seems clear to me. The type of an argument is separate from it having a default. You cannot pass a String? input to a task that requires a String. You can declare an input that has a default as an optional type or not. Either way, its type must match the type of its caller (or it must be implicitly convertible from the caller type). I guess one source of confusion is that the term "optional" is used for both types and default values but the word means different things in those places.

Regarding

    String? inp
    String _inp = select_first([inp, "default"])

this seems totally fine to me, if you need to convert an optional input to a non-optional type.

The reason why you'd want to support a defaulted input with a non-optional type is because you don't know if the type in the caller task has value or not. If the caller provides the input, and it has no value, then the value in the non-optional type would be invalid. The only way to be sure is to convert it to a non-optional type via select_first().

pshapiro4broad avatar May 12 '21 21:05 pshapiro4broad

I tried out your example, after adding version 1.0 and fixing a typo. miniwdl check rejects it (correctly), cromwell allows it (incorrectly). To me this looks like a bug in cromwell. It looks like the presence of a default value is affecting the type cromwell creates for the task input.

pshapiro4broad avatar May 12 '21 21:05 pshapiro4broad

miniwdl has a command-line flag --no-quant-check that relaxes some of the static checks and would probably make this pass validation.

However, I'm not certain whether it would then exhibit the same runtime behavior as Cromwell. In particular, if task Print receives an explicit binding of inp to None, miniwdl would consider that a runtime error since its type is not optional; but another interpretation might use the default value in that case.

By analogy, consider this python function

def f(x=1):
  return x

f() == 1 but f(x=None) == None. "Absence of presence" isn't the same as "presence of absence" (in python)

mlin avatar May 13 '21 22:05 mlin

"Absence of presence" isn't the same as "presence of absence" (in python)

This is an extremely elegant explanation of this problem. I've been looking for a phrase to describe this for ages!

illusional avatar May 13 '21 22:05 illusional

BTW, I showed the python example just to clarify the distinction, not because I think we necessarily have to follow its model.

I don't have a big problem with making the spec say that a task/workflow input String inp = "default" is allowed to receive None and inp takes the default value in that case. It leaves this rather subtle distinction that with String? inp = "default", inp would take None given the same inputs; there'd be some papercuts but not huge ones, and it's convenient compared to having to use select_first().

mlin avatar May 14 '21 07:05 mlin

@mlin I like your python example and thinks it demonstrates the expected behaviour. If a declaration is explicitly bound to None, one would expect declaration to exist in one of two states 1) it has None as its value, or 2) it throws an error because it's not optional. There are nonexamples in other languages I can think of where an explicit assignment to the null value signifies that a default value should be used.

patmagee avatar May 20 '21 02:05 patmagee

CWL has the behaviour, a null value will trigger the default to be applied:

The default value to use for this parameter if the parameter is missing from the input object, or if the value of the parameter in the input object is null. Default values are applied before evaluating expressions (e.g. dependent valueFrom fields) [source]

illusional avatar May 20 '21 02:05 illusional

I'm sensitive to the inconvenience caused by enforcing the distinction; namely, to have optional inputs that are passed through from workflow to task, to apply a default one would either need (i) an auxiliary declaration with select_first() (as @illusional's example at the top), or (ii) propagate the default values to the workflow level, which would mix concerns. Avoiding that seems like the lesser of two evils. I plan to experiment it with miniwdl, and will propose some spec clarification if it seems to work out.

mlin avatar May 20 '21 02:05 mlin

Opened #464 to clarify in the spec. It would be good to confirm that the proposed clarification matches what Cromwell actually does...

mlin avatar May 21 '21 07:05 mlin