Should files imported from JSON be relative to the working directory or relative to the JSON file
Given a directory structure like this:
.
├── example.txt
└── nested
├── example.txt
├── test.json
└── test.wdl
If nested/test.wdl is executed with a JSON input of nested/test.json where test.wdl and test.json are:
version 1.1
workflow wf {
input {
File f
}
output {
File f_out = f
}
}
{
"wf.f": "example.txt"
}
Which example.txt file should the workflow read? Should it be the one in the root directory example.txt or the nested directory nested/example.txt? Or should it be relative to the WDL file itself?
Also, if the input JSON is a URL instead, for example:
miniwdl run https://example.com/test.wdl -i https://example.com/test.json
Assuming the server has the same WDL/JSON files for test.wdl and test.json, should example.txt in this case refer to the url https://example.com/example.txt or a local file on the machine?
It looks like in https://github.com/openwdl/wdl/pull/670/files#diff-7426c9e3a694ca6015df5f98637912975f2edea23270203ee89a8bdeed246ee0R882 a note is being added that as of WDL 2.0 it will no longer be possible to have a relative path in an input file.
That does kind of break how we are shipping example input files and the corresponding test data with our workflows in our Git repo, though.
The prohibition would be on using relative paths in path literals in WDL files. For your use case it would probably be better to allow the user to pass a relative path on the command line or in an input.json rather than hard-code a relative path in the WDL.
To answer the original question - I think the only thing that can work is for paths to be relative to the folder that contains the .json input file. This is an implementation detail left up to the execution engine, but I think it should at least be suggested in the spec to handle relative paths in this way.
I also think that makes the most sense.
Right now Toil is interpreting the paths relative to the directory you are in when you launch the workflow, and then we have special magic code for when the JSON comes from a Github URL to interpret them as relative to the repository root. This is ugly and doesn't make a lot of sense; I think it happens for free because we see the path and we just try and open it.
Hey folks, this appears to have been answered, so I'm going to close the issue. Don't hesitate to open another one if you have more questions!