
Inconsistent behavior with optional file types when coercing from string

Open stxue1 opened this issue 8 months ago • 2 comments

When a file is declared as optional, the value should be null if the file does not exist. In an array of optional files, the entries for nonexistent files should therefore become null.
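
As a rough illustration of the behavior this report expects, here is a minimal Python sketch. It is not miniwdl's implementation; the helper names `coerce_optional_file` and `coerce_optional_file_array` are hypothetical, and paths are modeled as plain strings.

```python
import os
from typing import List, Optional


def coerce_optional_file(path: str) -> Optional[str]:
    """Hypothetical helper: model String -> File? coercion.

    Returns the path unchanged if it exists on disk, otherwise None
    (standing in for WDL's null).
    """
    return path if os.path.exists(path) else None


def coerce_optional_file_array(paths: List[str]) -> List[Optional[str]]:
    """Hypothetical helper: model String array -> Array[File?] coercion."""
    return [coerce_optional_file(p) for p in paths]
```

Under this model, an array whose paths are all missing coerces to an array of nulls, which is what the outputs further down show for all three arrays.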

When coercing from string to file, this check for file existence seems to be inconsistent depending on how the WDL is written:

version 1.1
workflow testWorkflow {
  input {
  }
  call testTask
  output {
    Array[File?] array_in_output = testTask.array_in_output
    Int len_in_output = testTask.len_in_output
    Array[File?] array_in_body_out = testTask.array_in_body_out
    Int len_in_body_out = testTask.len_in_body_out
    Array[File?] array_in_input_out = testTask.array_in_input_out
    Int len_in_input_out = testTask.len_in_input_out
  }
}

task testTask {
  input {
    Array[File?] array_in_input = ["example1.txt", "example2.txt"]
    Int len_in_input = length(select_all(array_in_input))
  }
  command <<<>>>
  Array[File?] array_in_body = ["example1.txt", "example2.txt"]
  Int len_in_body = length(select_all(array_in_body))
  output {
    Array[File?] array_in_output = ["example1.txt", "example2.txt"]
    Int len_in_output = length(select_all(array_in_output))
    Array[File?] array_in_body_out = array_in_body
    Int len_in_body_out = len_in_body
    Array[File?] array_in_input_out = array_in_input
    Int len_in_input_out = len_in_input
  }
}

With MINIWDL__FILE_IO__ALLOW_ANY_INPUT=True miniwdl run test.wdl, this returns:

{
  "dir": "/home/heaucques/Documents/wdl-conformance-tests/20240626_184902_testWorkflow",
  "outputs": {
    "testWorkflow.array_in_body_out": [
      null,
      null
    ],
    "testWorkflow.array_in_input_out": [
      null,
      null
    ],
    "testWorkflow.array_in_output": [
      null,
      null
    ],
    "testWorkflow.len_in_body_out": 2,
    "testWorkflow.len_in_input_out": 2,
    "testWorkflow.len_in_output": 0
  }
}

All of len_in_body_out, len_in_input_out, and len_in_output should be 0. However, inside the task input section and the task body, select_all appears to run on the string representation rather than the file representation; the file paths are checked for existence only when the declaration is evaluated in the output section.
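
The discrepancy can be modeled with a short Python sketch. This is an assumption-laden illustration of the two evaluation orders, not miniwdl's code; `select_all` here is a stand-in for the WDL function, and `count_present` and its `check_exists` flag are hypothetical.

```python
import os
from typing import List, Optional, TypeVar

T = TypeVar("T")


def select_all(values: List[Optional[T]]) -> List[T]:
    """Stand-in for WDL select_all: drop null (None) entries."""
    return [v for v in values if v is not None]


def count_present(paths: List[str], check_exists: bool) -> int:
    """Hypothetical model of length(select_all(...)) on an Array[File?].

    check_exists=False: the raw strings are passed straight through, so
    nothing is ever null (the behavior reported for task input/body).
    check_exists=True: each path is first coerced to None if it does not
    exist (the behavior reported for the task output section).
    """
    if check_exists:
        coerced = [p if os.path.exists(p) else None for p in paths]
    else:
        coerced = list(paths)
    return len(select_all(coerced))
```

With two nonexistent paths, `count_present(paths, check_exists=False)` gives 2 and `count_present(paths, check_exists=True)` gives 0, matching the len_in_body_out/len_in_input_out and len_in_output values above.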

String-to-file coercion also behaves differently depending on whether it happens in a task or in a workflow. The WDL above does most of its work in a task because the same code does not work in a workflow:

version 1.1
workflow testWorkflow {
  input {
  }
  output {
    Array[File?] array_in_output = ["example1.txt", "example2.txt"]
    Int len_in_output = length(select_all(array_in_output))
  }
}

Running this workflow fails with an InputError:

2024-06-26 18:53:32.953 wdl.w:testWorkflow workflow start :: name: "testWorkflow", source: "test.wdl", line: 2, column: 1, dir: "/home/heaucques/Documents/wdl-conformance-tests/20240626_185332_testWorkflow"
2024-06-26 18:53:32.954 wdl.w:testWorkflow miniwdl :: version: "v1.12.0", uname: "Linux pop-os 6.8.0-76060800daily20240311-generic #202403110203~1715181801~22.04~aba43ee SMP PREEMPT_DYNAMIC Wed M x86_64"
2024-06-26 18:53:32.954 wdl.w:testWorkflow task thread pool initialized :: task_concurrency: 8
2024-06-26 18:53:32.968 wdl.w:testWorkflow visit :: node: "output-array_in_output", values: {"array_in_output": ["example1.txt", "example2.txt"]}
2024-06-26 18:53:32.968 wdl.w:testWorkflow visit :: node: "output-len_in_output", values: {"len_in_output": 2}
2024-06-26 18:53:32.995 wdl.w:testWorkflow workflow testWorkflow (test.wdl Ln 2 Col 1) failed :: dir: "/home/heaucques/Documents/wdl-conformance-tests/20240626_185332_testWorkflow", error: "InputError", message: "workflow output uses nonexistent file/directory: example1.txt", node: "outputs"
2024-06-26 18:53:32.995 wdl.w:testWorkflow aborting workflow
2024-06-26 18:53:32.995 miniwdl-run workflow output uses nonexistent file/directory: example1.txt :: error: "InputError", node: "outputs", dir: "/home/heaucques/Documents/wdl-conformance-tests/20240626_185332_testWorkflow"

stxue1, Jun 27 '24 01:06