
Store Nextflow head process generated files

MrMarkW opened this issue 3 years ago • 5 comments

Description

Nextflow has options to generate timeline reports and DAGs on completion of a workflow. By default these are written to the directory the engine is running in. I couldn't find a simple way to have those files written to the context/project S3 bucket.

Use Case

This is useful so we can ingest the consolidated trace, timeline, report and DAG files that Nextflow provides for the workflow.

Proposed Solution

Allow configuration to copy files from the default (or a specified relative) directory on the head node to the S3 work dir bucket.
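To illustrate the kind of workaround meant here, a minimal sketch of an onComplete handler at the end of main.nf; it assumes the AWS CLI and credentials are available on the head node, and the bucket, prefix and file names are placeholders, not values AGC provides:

workflow.onComplete {
    // Hypothetical workaround: push head-node generated files to the work
    // dir bucket once the run finishes. Assumes the AWS CLI is on the head
    // node and the report/timeline file names match what the run produced.
    ['bash', '-c',
     'aws s3 cp timeline.html s3://<agc-bucket>/<prefix>/timeline.html && ' +
     'aws s3 cp report.html s3://<agc-bucket>/<prefix>/report.html'
    ].execute().waitFor()
}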

MrMarkW avatar Jan 31 '22 22:01 MrMarkW

This works for us in nf-core on AWS Batch and also with Tower; I haven't tried with Amazon Genomics CLI yet.

Perhaps having this block in your config might help?

timeline {
  overwrite = true
}
report {
  overwrite = true
}
trace {
  overwrite = true
}
dag {
  overwrite = true
}

See e.g. https://github.com/nf-core/configs/blob/master/conf/awsbatch.config#L12

heuermh avatar Mar 10 '22 02:03 heuermh

I will try this again. I also might not have been using the publishDir directive correctly.
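For reference, a minimal publishDir sketch; the process name, output file and S3 path below are placeholders:

process publish_summary {
    // Hypothetical process: publishDir copies the declared outputs to the
    // given S3 prefix. Bucket and prefix are placeholders.
    publishDir 's3://<agc-bucket>/<output-prefix>/', mode: 'copy'

    output:
    path 'summary.txt'

    script:
    """
    echo done > summary.txt
    """
}

Note that publishDir only applies to process outputs, so on its own it will not capture the head-node report/timeline/trace/DAG files.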

MrMarkW avatar Mar 11 '22 14:03 MrMarkW

Did you get this to work?

elswob avatar Apr 05 '22 12:04 elswob

Greetings! Sorry to say but this is a very old issue that is probably not getting as much attention as it deserves. We encourage you to check if this is still an issue in the latest release and if you find that this is still a problem, please feel free to open a new one.

github-actions[bot] avatar Jul 05 '22 00:07 github-actions[bot]

Hi @MrMarkW

This can be achieved either by including the following in your nextflow.config:

    report {
      enabled = true
      file = 's3://<agc-bucket>/<output-prefix>/report.html'
    }

    trace {
      enabled = true
      file = 's3://<agc-bucket>/<output-prefix>/trace.txt'
    }

    timeline {
      enabled = true
      file = 's3://<agc-bucket>/<output-prefix>/timeline.html'
    }

    dag {
      enabled = true
      file = 's3://<agc-bucket>/<output-prefix>/dag.html'
    }

or by adding this to your MANIFEST.json:

{
  "mainWorkflowURL": "main.nf",
  "inputFileURLs": [
    "inputs.json"
  ],
  "engineOptions": "-with-report s3://<agc-bucket>/<output-prefix>/report.html -with-trace s3://<agc-bucket>/<output-prefix>/trace.txt - with-timeline s3://<agc-bucket>/<output-prefix>/timeline.html -with-dag s3://<agc-bucket>/<output-prefix>/dag.html"
}

mmueller76 avatar Oct 05 '22 10:10 mmueller76