symlinks for staged files in directories are not removed in .command.run

Open nick-youngblut opened this issue 1 year ago • 7 comments

Bug report

For a failed job in my Nextflow pipeline, I'm manually running bash .command.run and I'm getting ln: failed to create symbolic link 'DIRECTORY_NAME/FILE_NAME.txt': File exists.

The nxf_stage() function includes:

nxf_stage() {
    true
    # stage input files
    mkdir -p 164164 && ln -s /home/nickyoungblut/tmp/work/ad/d0abcbad4c7b9137844e3ba48c8af4/KAPA_mRNA-enrichment_HumanRefRNA_500ng_1e-2dilution_20240417_C01_R1_001/summary.txt 164164/summary.txt
    mkdir -p 6262 && ln -s /home/nickyoungblut/tmp/work/53/f75d0ff2aa376187fafae661a5b400/DJv3_NT1_ctrl_rep1_031524_R1_001/fastqc_data.txt 6262/fastqc_data.txt
    mkdir -p 284284 && ln -s /home/nickyoungblut/tmp/work/be/b0b67f565d4ae5a0230e453acaa236/DJv2_FTH1_kd_rep2_031524_R2_001/fastqc_data.txt 284284/fastqc_data.txt 
    [...]
}

The symlinks are not removed via rm -f prior to recreating them in the nxf_stage() function, and ln -s is used instead of ln -sf. This results in the error when manually re-running .command.run. This makes troubleshooting failed jobs harder, since I have to manually delete the existing symlinks or comment out all of the ln -s commands in nxf_stage().

This issue does not occur for files that are not staged into directories, only for lines of the form mkdir -p new_directory && ln -s <target> new_directory/new_file.txt.

Expected behavior and actual behavior

See above

Steps to reproduce the problem

This should occur for any pipeline that stages files into directories: mkdir -p new_directory && ln -s <target> new_directory/new_file.txt
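The failure can be reproduced outside Nextflow with a minimal sketch like the one below. The scratch directory, the file names, and the stage function are all hypothetical stand-ins for what nxf_stage() does; only the mkdir -p ... && ln -s ... pattern is taken from the report.

```shell
#!/usr/bin/env bash
# Hypothetical reproduction: names are made up, only the mkdir/ln pattern
# mirrors the generated nxf_stage() shown in the report.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "data" > source.txt

stage() {
    # mkdir -p is safe to repeat; ln -s fails if the link already exists.
    mkdir -p new_directory && ln -s "$workdir/source.txt" new_directory/new_file.txt
}

stage; first_status=$?
stage 2>/dev/null; second_status=$?   # second call hits "File exists"

echo "first run exit status:  $first_status"
echo "second run exit status: $second_status"
```

The first call succeeds and the second returns a non-zero status, which is the same situation as re-running bash .command.run in a work directory that still holds the links from the failed attempt.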

Program output

See above

Environment

  • Nextflow version: 23.10.1
  • Java version: openjdk 21
  • Operating system: Linux
  • Bash version: 5.2.15

Additional context

See this slack thread

nick-youngblut avatar May 03 '24 16:05 nick-youngblut

This is likely because you have many staged files, see here

https://github.com/nextflow-io/nextflow/blob/aa9e127373de3bc0b4b78640279336cdd6d003aa/modules/nextflow/src/main/groovy/nextflow/executor/SimpleFileCopyStrategy.groovy#L122-L133

pditommaso avatar May 06 '24 08:05 pditommaso

Thanks @pditommaso for pointing that out! What is the problem with including possibly a few 1000 more lines in the runner script?

nick-youngblut avatar May 06 '24 14:05 nick-youngblut

It's explained in the comment: to contain the script file size. You can delete all symlinks using a Bash one-liner like find . -type l -delete or something similar

pditommaso avatar May 07 '24 12:05 pditommaso

Why does the file size need to be contained to <100 lines of removing symlinks? Extending to 1000's of lines will not add much size to the file.

You can delete all symlinks using a Bash one-liner like find . -type l -delete or something similar

Why not just use find . -type l -delete instead of removing each symlink individually in the runner script?
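As a rough illustration of what find . -type l -delete would do in a work directory, here is a sketch run in a scratch directory. All paths and file names are hypothetical, and -delete is an extension available in GNU and BSD find rather than something every find guarantees.

```shell
#!/usr/bin/env bash
# Hypothetical layout: one staged link in a subdirectory, one at the top
# level, and one regular file that must survive the cleanup.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "data" > source.txt
echo "keep" > regular.txt
mkdir -p new_directory
ln -s "$workdir/source.txt" new_directory/new_file.txt
ln -s "$workdir/source.txt" top_level_link.txt

# Remove every symlink below the current directory, whoever created it.
find . -type l -delete

# $(( ... )) strips any padding that wc -l adds on some platforms.
links_left=$(( $(find . -type l | wc -l) ))
files_left=$(( $(find . -type f | wc -l) ))
echo "symlinks left: $links_left"
echo "regular files left: $files_left"
```

The command removes nested and top-level links alike and leaves regular files and directories in place. It cannot tell staged-input links apart from any other symlink in the directory, so it is only safe if nothing else is expected to leave links there.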

nick-youngblut avatar May 07 '24 14:05 nick-youngblut

lol. Need to think if there could be other links. @bentsherman opinion?

pditommaso avatar May 07 '24 15:05 pditommaso

need to think if there could be other links

I thought all symlinks were (re)created by the runner script, but maybe I'm mistaken?

nick-youngblut avatar May 07 '24 15:05 nick-youngblut

Deleting all links should be fine; I can't think of any other links that are created. But Nick also suggested using ln -sf instead of deleting the links, so maybe that would be better.
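For comparison, a sketch of the ln -sf variant, using the same hypothetical scratch layout as a stand-in for nxf_stage(). This is not the actual generated script, only the pattern with -f added.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for nxf_stage() with ln -sf instead of ln -s.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "data" > source.txt

stage_force() {
    # -f removes an existing destination file or link before creating the new one.
    mkdir -p new_directory && ln -sf "$workdir/source.txt" new_directory/new_file.txt
}

stage_force; first_status=$?
stage_force; second_status=$?   # repeat call replaces the link instead of failing

link_target=$(readlink new_directory/new_file.txt)
echo "first run exit status:  $first_status"
echo "second run exit status: $second_status"
echo "link points to: $link_target"
```

Both calls succeed, so the stage step can be re-run without any cleanup. One caveat if staged inputs can be directories: when the existing link points to a directory, ln -sf follows it and creates the new link inside that directory, so ln -sfn (no-dereference, supported by GNU and BSD ln) would be needed for that case.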

bentsherman avatar May 07 '24 15:05 bentsherman

I just hit this problem too when using nextflow -resume:

Command exit status:
  1

Command output:
  (empty)

Command wrapper:
  ln: failed to create symbolic link 'prop_summary.json': File exists

Not sure if I understand the comments above. This just looks like a bug?

jamesamcl avatar Sep 09 '24 10:09 jamesamcl