cwltool
Inconsistent behaviour in packing

Open alexiswl opened this issue 3 years ago • 4 comments

Expected Behavior

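The out section of each step should have the same shape and id format in the packed output, regardless of which id attributes are set in the workflow.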

Actual Behavior

The out section of each step takes multiple forms in the packed output, depending on which id attributes are set in the main workflow.

Workflow Code

from https://github.com/common-workflow-language/cwl-v1.1/blob/main/tests/count-lines1-wf.cwl

#!/usr/bin/env cwl-runner
class: Workflow
cwlVersion: v1.1

inputs:
  file1:
    type: File

outputs:
  count_output:
    type: int
    outputSource: step2/output

steps:
  step1:
    run: wc-tool.cwl
    in:
      file1: file1
    out: [output]

  step2:
    run: parseInt-tool.cwl
    in:
      file1: step1/output
    out: [output]

Now let's pack it and check the out field of each step:

cwltool --pack count-lines1-wf.cwl | jq '.["$graph"][] | select(.class == "Workflow") | .steps[] | .out'

Yields

[
  "#main/step1/output"
]
[
  "#main/step2/output"
]
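
For readers without jq, the same filter can be approximated in Python. This is only a minimal sketch: step_outs is a hypothetical helper name, and the inline document is a hand-trimmed stand-in for the packed output, not real cwltool output. Unlike the jq filter, it returns an empty list instead of failing when there is no $graph.

```python
import json


def step_outs(packed):
    """Collect the `out` field of every step of every Workflow in $graph."""
    outs = []
    for process in packed.get("$graph", []):
        if process.get("class") != "Workflow":
            continue
        steps = process.get("steps", [])
        # jq's `.steps[]` iterates the values of either a list or a mapping
        if isinstance(steps, dict):
            steps = list(steps.values())
        for step in steps:
            outs.append(step.get("out"))
    return outs


# Hand-written stand-in for `cwltool --pack` output (assumption, trimmed)
sample = json.loads("""
{
  "$graph": [
    {"class": "CommandLineTool", "id": "#wc-tool.cwl"},
    {"class": "Workflow", "id": "#main", "steps": [
      {"id": "#main/step1", "out": ["#main/step1/output"]},
      {"id": "#main/step2", "out": ["#main/step2/output"]}
    ]}
  ]
}
""")

print(step_outs(sample))
```

In practice the dictionary would come from json.load on the saved output of cwltool --pack.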

Now let's add the id attribute to the workflow

#!/usr/bin/env cwl-runner
class: Workflow
cwlVersion: v1.1

id: count-lines-1-wf

inputs:
  file1:
    type: File

outputs:
  count_output:
    type: int
    outputSource: step2/output

steps:
  step1:
    run: wc-tool.cwl
    in:
      file1: file1
    out: [output]

  step2:
    run: parseInt-tool.cwl
    in:
      file1: step1/output
    out: [output]

We run the same command again:

cwltool --pack count-lines1-wf.cwl 2>/dev/null | jq '.["$graph"][] | select(.class == "Workflow") | .steps[] | .out'

and get

[
  "#/step1/output"
]
[
  "#/step2/output"
]

Yet if we turn each entry of the out list into a mapping with an id key, we get different ids again: note that #/stepX has changed to #stepX.

#!/usr/bin/env cwl-runner
class: Workflow
cwlVersion: v1.1

id: counts-lines-1-wf

inputs:
  file1:
    type: File

outputs:
  count_output:
    type: int
    outputSource: step2/output

steps:
  step1:
    run: wc-tool.cwl
    in:
      file1: file1
    out:
      - id: output

  step2:
    run: parseInt-tool.cwl
    in:
      file1: step1/output
    out:
      - id: output

We run the same command again

cwltool --pack count-lines1-wf.cwl 2>/dev/null | jq '.["$graph"][] | select(.class == "Workflow") | .steps[] | .out'

but now we get

[
  {
    "id": "#step1/output"
  }
]
[
  {
    "id": "#step2/output"
  }
]
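
Until the packing is consistent, a consumer could reduce the three variants above to one form before comparing them. The following is a rough sketch of that idea, not anything cwltool provides: out_name is a hypothetical helper, and stripping a leading main/ is an assumption that would misfire on a workflow whose step is itself named main.

```python
def out_name(entry):
    """Reduce one step `out` entry to a bare 'step/output' name."""
    # An entry is either a plain string or a mapping carrying an "id" key
    ident = entry["id"] if isinstance(entry, dict) else entry
    # Keep only the part after the "#" fragment marker
    fragment = ident.split("#", 1)[-1]
    # Drop the leading "/" seen when the workflow has its own id
    fragment = fragment.lstrip("/")
    # Drop the "main/" prefix seen when the workflow has no id (assumption)
    if fragment.startswith("main/"):
        fragment = fragment[len("main/"):]
    return fragment


# The three forms reported in this issue for step1
variants = [
    "#main/step1/output",
    "#/step1/output",
    {"id": "#step1/output"},
]

print([out_name(v) for v in variants])
# ['step1/output', 'step1/output', 'step1/output']
```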

Your Environment

  • cwltool version: 3.0.20201203173111

alexiswl, Jul 02 '21 05:07

Related to: https://github.com/common-workflow-language/cwltool/issues/1436

alexiswl, Jul 02 '21 05:07

Any thoughts on this behaviour before we explore workarounds with the workflow engine?

ohofmann, Jul 12 '21 23:07

What's the actual effect of this? The main issue I can see is that the spec says to start at "#main", so if there's no "#main" then that would be a problem.

tetron, Jul 13 '21 13:07
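
The "#main" concern can be checked mechanically on a packed document. This is a minimal sketch: has_main is a hypothetical helper, and both sample documents are made up for illustration, since the issue does not show which workflow id cwltool emits when id is set.

```python
def has_main(packed):
    """Return True if any process in $graph carries the id '#main'."""
    return any(p.get("id") == "#main" for p in packed.get("$graph", []))


# Made-up documents; the second id is a guess, not observed cwltool output
with_main = {"$graph": [{"class": "Workflow", "id": "#main"}]}
without_main = {"$graph": [{"class": "Workflow", "id": "#count-lines-1-wf"}]}

print(has_main(with_main), has_main(without_main))
```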

I don't think this causes any issue in a correct implementation of a CWL engine, other than the general inconsistency. I'm more concerned with https://github.com/common-workflow-language/cwltool/issues/1436, where the output attribute changes depending on whether id is used or not. These outputs belong to steps and are internal variables in the workflow.

alexiswl, Jul 13 '21 23:07