amazon-genomics-cli

Snakemake workflow fails on completion with output directories

Open ElDeveloper opened this issue 2 years ago • 0 comments

Describe the Bug

Running a Snakemake workflow with directories as outputs makes the whole run fail. The reason is that snakemake.aws.sh attempts to copy every output listed by snakemake --list-output -q as if it were a file, i.e. using aws s3 cp .... For an output that was marked as a directory, that listing contains a directory name, so the cp operation fails.

Steps to Reproduce

Run a snakemake workflow that has a directory as an output, for example:

    output:
        db=directory("dbs/dope-db/"),

Relevant Logs

Here's a redacted version of the output I saw from a recent run. As you can tell, dope-db fails in the copy step.

Mon, 02 May 2022 16:23:37 -0700	unlocking
Mon, 02 May 2022 16:23:37 -0700	removing lock
Mon, 02 May 2022 16:23:37 -0700	removing lock
Mon, 02 May 2022 16:23:37 -0700	removed all locks
Mon, 02 May 2022 16:23:37 -0700	Snakmake outputs are:
Mon, 02 May 2022 16:23:38 -0700	Building DAG of jobs...
Mon, 02 May 2022 16:23:39 -0700	Updating job all.
Mon, 02 May 2022 16:23:39 -0700	Updating job tabulate.
Mon, 02 May 2022 16:23:39 -0700	Updating job tabulate.
Mon, 02 May 2022 16:23:39 -0700	Updating job tabulate.
Mon, 02 May 2022 16:23:41 -0700	output_file	date	rule	version	log-file(s)	status	plan
Mon, 02 May 2022 16:23:41 -0700	alignments/redacted.sam	Mon May  2 23:20:06 2022	align	-		ok	

[...]

Mon, 02 May 2022 16:23:41 -0700	dbs/dope-db	Mon May  2 23:09:11 2022	download	-		ok	no update
Mon, 02 May 2022 16:23:42 -0700	copying outputs efs with s3
Mon, 02 May 2022 16:23:42 -0700	attempt 1 at copying alignments/redacted.sam to s3://[redacted]/redacted.sam
upload: alignments/redacted.sam to s3://[redacted]/redacted.sam
Mon, 02 May 2022 16:25:29 -0700	attempt 1 at copying dbs/dope-db to s3://[redacted]/dope-db
Mon, 02 May 2022 16:25:29 -0700	upload failed: dbs/dope-db/ to s3://[redacted]/dope-db [Errno 21] Is a directory: '/mnt/efs/snakemake/[redacted]/dbs/dope-db/'
Mon, 02 May 2022 16:25:30 -0700	=== Running Cleanup ===
Mon, 02 May 2022 16:25:30 -0700	=== Bye! ===

Expected Behavior

The copying step should know how to handle directory copying (probably with aws s3 sync).
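
A minimal sketch of what that could look like, assuming the copy step calls some per-output helper. The function name copy_output and the AWS_CMD variable are hypothetical and are not taken from snakemake.aws.sh; only aws s3 cp and aws s3 sync are real commands.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the actual code in snakemake.aws.sh.
# AWS_CMD is an assumed override hook (e.g. AWS_CMD="echo aws" for a dry run).
AWS_CMD="${AWS_CMD:-aws}"

copy_output() {
    local src="$1" dest="$2"
    if [ -d "$src" ]; then
        # Directory output (declared with directory() in the Snakefile):
        # sync uploads the directory contents recursively.
        $AWS_CMD s3 sync "$src" "$dest"
    else
        # Regular file output: a plain copy is enough.
        $AWS_CMD s3 cp "$src" "$dest"
    fi
}
```

With AWS_CMD="echo aws", calling copy_output dbs/dope-db s3://[redacted]/dope-db on an existing directory prints the sync command instead of running it, which makes the branch easy to check before touching a bucket.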

Actual Behavior

snakemake.aws.sh fails to copy a directory and any of the subsequent outputs.

Screenshots

Additional Context

Operating System:
AGC Version: 1.4.0
Was AGC setup with a custom bucket: no
Was AGC setup with a custom VPC: no

ElDeveloper, May 11 '22 18:05