Proposal: add required runtime-like section for execution engine

Open vsmalladi opened this issue 2 years ago • 7 comments

In the new spec there is a proposal to make the runtime section stricter and to add an optional hints section that an execution engine can use but that won't hinder execution of the workflow. However, I feel this fails to address the more common scenario where there are cloud/execution engine specs that must be present for the workflow to run.

My proposal is to extend the runtime section with nested cloud/execution engine specs that would be required for running on the corresponding system. This would ensure that at least the minimum requirements to run are met.

task foo {
  .... 
  runtime {
    gcp: {
      # gcp specific
      ...
    }
    aws: {
      # aws specific
      ...
    }
    azure: {
      # azure specific
      ...
    }
    cromwell: {
      # cromwell specific 
      ...
    }
    miniwdl: {
      # miniwdl specific
      ...
    }
  }
}

The execution engine could take a command-line option listing the runtime names to read and validate; if it is not given, the engine would skip all of them and run with the default runtime options. Some examples are below:

Cromwell

java -jar /path/to/cromwell.jar run --runtime cromwell,gcp

miniWDL

miniwdl run --runtime-configuration miniwdl,azure

vsmalladi avatar Dec 01 '21 22:12 vsmalladi

These are intended to go in the hints section. Please note that the changes are not merely proposed; they are approved and merged into the development spec.

jdidion avatar Dec 01 '21 22:12 jdidion

It might help if you give a concrete example. What types of cloud- or executor-specific attributes need to be requirements rather than hints?

jdidion avatar Dec 01 '21 22:12 jdidion

@jdidion I understand the hints section has been approved and merged. I apologize that my example put this in the runtime section; that was not my intention.

From the spec, it seems that by this section's definition a task execution never fails due to the inability of the execution engine to recognize or satisfy a hint. To me that means that if any required cloud- or executor-specific attributes are missing from the hints, the workflow will fail to run, which contradicts the definition of hints.

I am looking across all cloud- and executor-specific attributes to see what is required and will add my findings to this thread.

vsmalladi avatar Dec 02 '21 20:12 vsmalladi

I think that the nested cloud/execution engine specs would work better. On AWS they use 'disks' to specify storage; however, we use 'disk' in the standard WDL runtime to run on k8s. There should always be a 'disk' specification across the different backends.

JaylanLiu avatar Aug 23 '22 01:08 JaylanLiu

'disk' is specific to the TES backend, while GCP and AWS use 'disks' with MiniWDL and Cromwell.

vsmalladi avatar Aug 23 '22 02:08 vsmalladi

The disk field has been standardized in a future (so far perpetually in the future) 2.0 release (see https://github.com/openwdl/wdl/pull/315).

patmagee avatar Aug 26 '22 20:08 patmagee

@patmagee I still see 'disks'; however, Cromwell/TES expects 'disk'.

vsmalladi avatar Aug 26 '22 21:08 vsmalladi

@jdidion and @patmagee I would like to restart this conversation. Below are some examples of attributes that are required or expected in some form per task in the WDL.

AWS uses queueArn and awsBatchRetryAttempts.
DNAnexus uses dx_instance_type.
GCP uses gpuCount, gpuType, and nvidiaDriverVersion.
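As a rough illustration, in WDL 1.0/1.1 the runtime section accepts arbitrary engine-specific keys, so these attributes can end up side by side in one task. This is only a sketch: every value is a placeholder, and which keys an engine actually honors depends on the engine and its backend configuration.

```wdl
version 1.0

task foo {
  command {
    echo "hello"
  }

  runtime {
    docker: "ubuntu:20.04"

    # AWS Batch attributes (placeholder ARN and retry count)
    queueArn: "arn:aws:batch:us-east-1:111122223333:job-queue/example"
    awsBatchRetryAttempts: 3

    # DNAnexus attribute (placeholder instance type)
    dx_instance_type: "mem1_ssd1_x4"

    # GCP GPU attributes (placeholder values)
    gpuCount: 1
    gpuType: "nvidia-tesla-t4"
    nvidiaDriverVersion: "418.87.00"
  }
}
```

An engine that does not recognize one of these keys is free to ignore it, so nothing in the task itself marks which of them are required on a given backend.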

Some have been taken out and set as default runtime attributes, as in Cromwell, but that doesn't offer task-level support. What's the best way to support these options and extensions without having to constantly change WDL for these task-specific execution engine runtime attributes?

vsmalladi avatar Nov 04 '22 19:11 vsmalladi

I think our policy is to try to standardize things that are fairly common, at least conceptually, and let the execution engine interpret them as it wants.

For example, we have the reserved max_retries which, when the task is run by AWS Batch, can be interpreted as awsBatchRetryAttempts.
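A minimal sketch of that idea, assuming WDL 1.2 syntax, where max_retries lives in the requirements section. The task name, container, and retry count are placeholders, and how the value maps to a backend setting is left to the engine:

```wdl
version 1.2

task foo {
  command <<<
    echo "hello"
  >>>

  requirements {
    container: "ubuntu:latest"
    # Generic retry count. An engine submitting to AWS Batch could
    # translate this into its own retry setting; the mapping is the
    # engine's decision, not something this task spells out.
    max_retries: 3
  }
}
```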

As another example, we've added a gpu hint in WDL 1.2, which can accept an arbitrary specification string that can be interpreted differently by each backend. Along with this, we've specified that backend-specific hints override task-level hints, so that a different hint value can be specified for each backend.
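A sketch of a task-level gpu hint, assuming WDL 1.2, where the requirements entry is a Boolean and the hint carries the free-form specification. The specification string below is hypothetical, and no backend is guaranteed to understand it:

```wdl
version 1.2

task bar {
  command <<<
    echo "hello"
  >>>

  requirements {
    container: "ubuntu:latest"
    # Hard requirement: the task needs a GPU to run at all.
    gpu: true
  }

  hints {
    # Optional, free-form specification string (hypothetical value).
    # Each backend may interpret it differently or ignore it.
    gpu: "1 nvidia-tesla-t4"
  }
}
```

Because this is a hint, an engine that cannot parse the string should still run the task rather than fail.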

For things that are specific to a particular implementation engine or backend, hints is the most appropriate place to put those. In WDL 1.2 there is a Compute Environments section. I have no problem with standardizing the names we use for the different backends and backend-specific hints but I would want that to be championed by someone in the community (preferably someone representing the backend provider in question), and I'd want to make sure we look at all the existing implementations to see if there are existing hints we can leverage or if we need to reconcile multiple implementations that do the same thing in different ways.

@vsmalladi if you'd like to propose additional changes I think WDL 1.2 is a good opportunity to do so. I'd recommend opening issues with more specific requests.

jdidion avatar Feb 06 '24 22:02 jdidion

Great, will put in some specifics.

vsmalladi avatar Feb 07 '24 00:02 vsmalladi