Code executed using 'exec' is executed outside of work directory

Open ssadedin opened this issue 2 years ago • 4 comments

Bug report

Expected behavior and actual behavior

The documentation says that I can just replace a script command with Groovy language code to do some processing. However, when I do that, the code executes in the directory where the pipeline was executed instead of in the work directory of the process. I couldn't find a good way to make it execute in the directory of the process.

Expected behavior: code executes in the work directory, the same as a script would.

Actual behavior: code executes in the directory from which Nextflow was launched.

Steps to reproduce the problem

nextflow.enable.dsl = 2

process hello {
    echo true

    input:
        val world

    output:
        path 'test.txt'

    exec:
        file('test.txt').text = "hello $world"

}

 workflow {
     Channel.fromList(['mars','jupiter']) | hello
 }

Program output

N E X T F L O W  ~  version 21.10.6
Launching `test.groovy` [hopeful_volhard] - revision: 3d50bfdfcf
executor >  local (2)
executor >  local (2)
[0b/fc4d48] process > hello (1) [100%] 1 of 1, failed: 1
Error executing process > 'hello (2)'

Caused by:
  Missing output file(s) `test.txt` expected by process `hello (2)`

Source block:
  file('test.txt').text = "hello $world"

Work dir:
  ...

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

Apart from executing in the wrong directory, it also has the undesirable consequence that the stages overwrite each other's output files (so at the end there is a single test.txt in the same directory where I launched the pipeline, containing a semi-random output).

Additional context

I guess this could be tricky to fix if Nextflow is directly executing the code in the same JVM as the nextflow manager script, since you obviously cannot have multiple processes all changing the process cwd at the same time. Perhaps the default behaviour of file could be modified so that it returns results relative to the work directory? It won't fix everything, but a subset of use cases will work.

If not, at least some warning in the documentation could be added.

I was curious if there is a way to get the path to the actual working directory for a process so that I could set it manually.

ssadedin avatar Feb 06 '22 02:02 ssadedin

This happens because the JVM always resolves a relative path against the directory from which Nextflow was launched.

Therefore, the task work directory should be obtained from the attribute task.workDir, e.g.:

task.workDir.resolve('test.txt').text = "hello $world"
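
To illustrate the underlying JVM behavior outside Nextflow, here is a minimal Java sketch. It assumes a temporary directory as a stand-in for a task work directory; the class name `RelativePathDemo` and the helper `inDir` are hypothetical and are not part of Nextflow. It shows that a bare relative path is made absolute against the JVM's default directory (typically the launch directory), while resolving against an explicit directory puts the file where it is wanted.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class RelativePathDemo {
    // Hypothetical helper: resolve a file name against an explicit directory,
    // analogous to task.workDir.resolve('test.txt') in the workaround above.
    static Path inDir(Path workDir, String name) {
        return workDir.resolve(name);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a task work directory (assumption; Nextflow creates its own).
        Path workDir = Files.createTempDirectory("task-work-");

        // A bare relative path is made absolute against the JVM default
        // directory, which is typically where the JVM was launched.
        Path relative = Paths.get("test.txt").toAbsolutePath();
        System.out.println("relative path resolves to: " + relative);

        // Resolving against the explicit directory targets the intended location.
        Path target = inDir(workDir, "test.txt");
        Files.write(target, "hello mars".getBytes(StandardCharsets.UTF_8));
        System.out.println("explicit path resolves to: " + target);
    }
}
```

Since the JVM offers no supported way to change its working directory per thread, passing the directory explicitly, as the workaround does, is the straightforward option.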

pditommaso avatar Feb 09 '22 21:02 pditommaso

Thinking about it more, we should check whether it's possible to intercept the file invocation within the process context and resolve the relative path against task.workDir. Tagging @jorgeaguileraseqera.

pditommaso avatar Feb 09 '22 21:02 pditommaso

I ran into the same issue when I wanted to use some Groovy to munge some collected data from a channel.

Channel
  .fromList(['a','ba','cab','done','elbow','fibers','ghastly', ''])
  .into {ch1}

process exec_to_file {
  publishDir "report"

  input:
  val consolidated from ch1.collect()

  output:
  path 'exec_ex.txt'

  exec:
  new File('./exec_ex.txt').withWriter { writer ->
    consolidated.target.each { val ->
      writer.writeLine val
    }
  }
}

Following @pditommaso's workaround was successful:

  exec:
  outfile = task.workDir.resolve('exec_ex.txt')
  outfile.withWriter { writer ->
    consolidated.target.each { val ->
      writer.writeLine val
    }
  }

grosscol avatar Apr 27 '22 18:04 grosscol

As a PoC, I've created this branch:

https://github.com/nextflow-io/nextflow/tree/2628-code-executed-using-exec-is-executed-outside-of-work-directory

The idea is to inject the workDir into a ThreadLocal and use it in Nextflow functions such as file (as implemented in this branch), path, etc.

With this approach, DSL methods can work in the working directory of the process out of the box, and custom scripts can use the suggested approach if they want.
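
The general ThreadLocal idea can be sketched in plain Java as follows. This is not the code from the branch: the class `TaskContext` and its methods `enter`, `exit`, and `file` are hypothetical names chosen for illustration, and the sketch assumes that the task body runs on the same thread that set the directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class TaskContext {
    // Work directory of the task running on the current thread;
    // null when no task is active on this thread.
    private static final ThreadLocal<Path> WORK_DIR = new ThreadLocal<>();

    static void enter(Path workDir) { WORK_DIR.set(workDir); }

    static void exit() { WORK_DIR.remove(); }

    // Hypothetical stand-in for a DSL helper such as file():
    // relative names resolve against the current task's directory.
    static Path file(String name) {
        Path p = Paths.get(name);
        Path base = WORK_DIR.get();
        if (p.isAbsolute() || base == null) {
            return p; // absolute path, or no task context: leave unchanged
        }
        return base.resolve(p);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            // Each thread sets its own directory, so concurrent tasks do not interfere.
            enter(Paths.get("work", Thread.currentThread().getName()));
            try {
                System.out.println(file("test.txt"));
            } finally {
                exit(); // always clear, since pooled threads are reused
            }
        };
        Thread a = new Thread(task, "task-a");
        Thread b = new Thread(task, "task-b");
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```

Each thread prints a path under its own directory, in whichever order the threads happen to run. Code that builds paths directly, for example with `new File(...)`, bypasses such a helper, which is why custom scripts would still need the explicit task.workDir approach.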

jorgeaguileraseqera avatar Aug 01 '22 18:08 jorgeaguileraseqera

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jan 16 '23 01:01 stale[bot]