error if symlinks with absolute paths are present inside Directory outputs
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
inputs: []
baseCommand: [ bash, -c ]
arguments:
- "mkdir foo; echo 42 > foo/bar; ln -s $PWD/foo/bar foo/baz"
outputs:
  result:
    type: Directory
    outputBinding:
      glob: foo
Short-term workaround: use --copy-outputs, but that leaves intermediate files lying around afterwards.
Example error:
$ TMPDIR=$PWD cwltool --debug tests/symlinks.cwl
INFO /home/michael/ebi/env/bin/cwltool 3.1.20210623153106
INFO Resolved 'tests/symlinks.cwl' to 'file:///home/michael/cwltool/tests/symlinks.cwl'
DEBUG Parsed job order from command line: {
"id": "tests/symlinks.cwl"
}
DEBUG [job symlinks.cwl] initializing from file:///home/michael/cwltool/tests/symlinks.cwl
DEBUG [job symlinks.cwl] {}
DEBUG [job symlinks.cwl] path mappings is {}
DEBUG [job symlinks.cwl] command line bindings is [
{
"position": [
-1000000,
0
],
"datum": "bash"
},
{
"position": [
-1000000,
1
],
"datum": "-c"
},
{
"position": [
0,
0
],
"datum": "mkdir foo; echo 42 > foo/bar; ln -s $PWD/foo/bar foo/baz"
}
]
DEBUG [job symlinks.cwl] initial work dir {}
INFO [job symlinks.cwl] /home/michael/cwltool/l6s28ubx$ bash \
-c \
'mkdir foo; echo 42 > foo/bar; ln -s $PWD/foo/bar foo/baz'
DEBUG Could not collect memory usage, job ended before monitoring began.
INFO [job symlinks.cwl] completed success
DEBUG [job symlinks.cwl] outputs {
"result": {
"location": "file:///home/michael/cwltool/l6s28ubx/foo",
"basename": "foo",
"nameroot": "foo",
"nameext": "",
"class": "Directory"
}
}
DEBUG [job symlinks.cwl] Removing input staging directory /home/michael/cwltool/tp94m_vi
DEBUG [job symlinks.cwl] Removing temporary directory /home/michael/cwltool/2zomy12a
DEBUG Moving /home/michael/cwltool/l6s28ubx/foo to /home/michael/cwltool/foo
DEBUG Moving /home/michael/cwltool/l6s28ubx/foo/bar to /home/michael/cwltool/foo/bar
DEBUG Moving /home/michael/cwltool/l6s28ubx/foo/bar to /home/michael/cwltool/foo/baz
ERROR Unhandled error:
[Errno 2] No such file or directory: '/home/michael/cwltool/l6s28ubx/foo/bar'
Traceback (most recent call last):
File "/usr/lib/python3.9/shutil.py", line 806, in move
os.rename(src, real_dst)
FileNotFoundError: [Errno 2] No such file or directory: '/home/michael/cwltool/l6s28ubx/foo/bar' -> '/home/michael/cwltool/foo/baz'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/michael/cwltool/cwltool/main.py", line 1248, in main
(out, status) = real_executor(
File "/home/michael/cwltool/cwltool/executors.py", line 59, in __call__
return self.execute(process, job_order_object, runtime_context, logger)
File "/home/michael/cwltool/cwltool/executors.py", line 155, in execute
self.final_output[0] = relocateOutputs(
File "/home/michael/cwltool/cwltool/process.py", line 400, in relocateOutputs
stage_files(pm, stage_func=_relocate, symlink=False, fix_conflicts=True)
File "/home/michael/cwltool/cwltool/process.py", line 296, in stage_files
stage_func(entry.resolved, entry.target)
File "/home/michael/cwltool/cwltool/process.py", line 371, in _relocate
_relocate(dir_entry.path, fs_access.join(dst, dir_entry.name))
File "/home/michael/cwltool/cwltool/process.py", line 373, in _relocate
shutil.move(src, dst)
File "/usr/lib/python3.9/shutil.py", line 820, in move
copy_function(src, real_dst)
File "/usr/lib/python3.9/shutil.py", line 435, in copy2
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/usr/lib/python3.9/shutil.py", line 264, in copyfile
with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
FileNotFoundError: [Errno 2] No such file or directory: '/home/michael/cwltool/l6s28ubx/foo/bar'
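The two `Moving` lines in the debug log share the same source path (`foo/bar`), which suggests the symlink `baz` was resolved to its target before relocation. The following is a simplified sketch of that failure mode, not cwltool's actual code: the function name `reproduce_dangling_move` and the directory names are made up for illustration, and the "resolve everything first, then move" order is an assumption based on the log.

```python
import os
import shutil
import tempfile


def reproduce_dangling_move() -> str:
    """Move a file, then try to move it again via a resolved absolute symlink.

    Returns the name of the exception raised, or "ok" if nothing failed.
    """
    with tempfile.TemporaryDirectory() as root:
        outdir = os.path.join(root, "job", "foo")   # stands in for the job's output dir
        final = os.path.join(root, "final", "foo")  # stands in for the final location
        os.makedirs(outdir)
        os.makedirs(final)

        bar = os.path.join(outdir, "bar")
        with open(bar, "w") as handle:
            handle.write("42\n")
        # Absolute symlink, like `ln -s $PWD/foo/bar foo/baz` in the tool above.
        os.symlink(bar, os.path.join(outdir, "baz"))

        # Assumption: each entry is resolved to its real path up front,
        # so both entries end up with the same source.
        plan = [
            (os.path.realpath(os.path.join(outdir, name)), os.path.join(final, name))
            for name in ("bar", "baz")
        ]
        try:
            for src, dst in plan:
                # The second move fails: its source was already moved away.
                shutil.move(src, dst)
        except FileNotFoundError as exc:
            return type(exc).__name__
        return "ok"
```

Under these assumptions the second `shutil.move` fails the same way as in the traceback: `os.rename` cannot find the source, and the copy fallback then fails when opening it.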
The fix and/or its test seem flaky, so I reverted it in https://github.com/common-workflow-language/cwltool/pull/1483
Hi, is there any chance this can be fixed in the future? We have a CommandLineTool that uses a bash script at runtime to move all intermediate files and folders to a destination path:
requirements:
  InlineJavascriptRequirement: {}
  ShellCommandRequirement: {}
  InitialWorkDirRequirement:
    listing:
      - entryname: mv.sh
        entry: |-
          shift;
          mkdir $(inputs.destination);
          # Move each file individually, ignoring non-existing files
          for file in $@; do
            if [ -e "$file" ]; then
              mv -n "$file" "$(inputs.destination)"
            fi
            ls $(inputs.destination)
          done
inputs:
  files:
    type: File[]?
    inputBinding:
      position: 2
  folders:
    type: Directory[]?
    inputBinding:
      position: 3
  destination:
    type: string
    inputBinding:
      position: 1
baseCommand: [bash, -x, mv.sh]
outputs:
  results:
    type: Directory
    outputBinding:
      glob: $(inputs.destination)
The tool above fails most of the time: instead of the real generated files, the output directory contains only broken symlinks pointing to files in a tmpdir that no longer exists. Sometimes cwltool finishes with success and the log shows the mv steps completing without errors, yet the output folder (destination) still contains only broken symlinks. A typical error:
ERROR Unhandled error, try again with --debug for more information:
[Errno 2] No such file or directory: '/data/home/username/tmp/531ybmnk/aaaaaaa_illuminaQC_illumina_filtered_bbduk-summary.txt'
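To check whether an output folder is affected, one option is to scan it for symlinks whose targets no longer exist. This is a minimal sketch using only the Python standard library; the helper name `find_broken_symlinks` is hypothetical and not part of cwltool.

```python
import os
from typing import List


def find_broken_symlinks(directory: str) -> List[str]:
    """Return the sorted paths of all symlinks under `directory` with a missing target."""
    broken = []
    # os.walk does not follow symlinks by default, so each link is visited once.
    for dirpath, dirnames, filenames in os.walk(directory):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() inspects the link itself; exists() follows it to the target.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)
```

Calling `find_broken_symlinks` on the destination folder after a run lists the dangling links, which makes it easier to tell a genuinely successful run from one that only reported success.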
FYI, this is a rewrite of an older CWL script with the same purpose. We rewrote it to minimize the use of JavaScript expressions, as suggested in workflow description. The old version used a JavaScript expression:
requirements:
  - class: InlineJavascriptRequirement
inputs:
  files:
    type: File[]?
  folders:
    type: Directory[]?
  destination:
    type: string
expression: |
  ${
    var array = []
    if (inputs.files != null) {
      array = array.concat(inputs.files)
    }
    if (inputs.folders != null) {
      array = array.concat(inputs.folders)
    }
    var r = {
      'results':
        { "class": "Directory",
          "basename": inputs.destination,
          "listing": array
        }
    };
    return r;
  }
outputs:
  results:
    type: Directory
Thanks!