snakemake
Report generation fails when script path contains wildcard
Snakemake version
7.8.1
Describe the bug
When a rule has a wildcard in the path to an external script, --report is unable to parse the script path for inclusion.
Logs
Snakemake execution
$ snakemake --cores 1
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job           count    min threads    max threads
----------  -------  -------------  -------------
all               1              1              1
use_script        2              1              1
total             3              1              1
Select jobs to execute...
[Wed Jun 1 15:01:37 2022]
rule use_script:
    input: data/file1.txt
    output: results/file.bar.txt
    jobid: 2
    reason: Missing output files: results/file.bar.txt
    wildcards: filetype=bar
    resources: tmpdir=/tmp
[Wed Jun 1 15:01:38 2022]
Finished job 2.
1 of 3 steps (33%) done
Select jobs to execute...
[Wed Jun 1 15:01:38 2022]
rule use_script:
    input: data/file1.txt
    output: results/file.foo.txt
    jobid: 1
    reason: Missing output files: results/file.foo.txt
    wildcards: filetype=foo
    resources: tmpdir=/tmp
[Wed Jun 1 15:01:38 2022]
Finished job 1.
2 of 3 steps (67%) done
Select jobs to execute...
[Wed Jun 1 15:01:38 2022]
localrule all:
    input: results/file.foo.txt, results/file.bar.txt
    jobid: 0
    reason: Input files updated by another job: results/file.bar.txt, results/file.foo.txt
    resources: tmpdir=/tmp
[Wed Jun 1 15:01:38 2022]
Finished job 0.
3 of 3 steps (100%) done
Complete log: .snakemake/log/2022-06-01T150136.792964.snakemake.log
Report generation
$ snakemake --report report.html
Building DAG of jobs...
Creating report...
Loading script code for rule use_script
WorkflowError:
Failed to open source file /home/grahman/projects/snakemake-bug/workflow/scripts/process_{wildcards.filetype}.py
FileNotFoundError: [Errno 2] No such file or directory: '/home/grahman/projects/snakemake-bug/workflow/scripts/process_{wildcards.filetype}.py'
File tree
$ tree
.
├── data
│   └── file1.txt
├── results
│   ├── file.bar.txt
│   └── file.foo.txt
└── workflow
    ├── report
    │   └── workflow.rst
    ├── rules
    ├── scripts
    │   ├── process_bar.py
    │   └── process_foo.py
    └── Snakefile
6 directories, 7 files
Minimal example
Snakefile
report: "report/workflow.rst"

rule all:
    input:
        expand("results/file.{filetype}.txt", filetype=["foo", "bar"])

rule use_script:
    input:
        "data/file1.txt"
    output:
        "results/file.{filetype}.txt"
    script:
        "scripts/process_{wildcards.filetype}.py"
scripts/process_foo.py
with open(snakemake.output[0], "w") as f:
    f.write("This is process_foo.py")
scripts/process_bar.py
with open(snakemake.output[0], "w") as f:
    f.write("This is process_bar.py")
Additional context
The workflow itself runs fine: the files requested by rule all are generated properly. However, I would expect --report to find the correct Python script (based on the wildcard) and load it into the report.
I have been investigating this and have a hypothesis for why it occurs. In the following line, the script path is checked for wildcards, and this check evaluates to False.
https://github.com/snakemake/snakemake/blob/531ef4a6b962b06ef865ba34ea6e952a7d4b3a2b/snakemake/report/__init__.py#L292
As a result, none of the if statements is True, and Snakemake assumes that it is dealing with a run directive.
https://github.com/snakemake/snakemake/blob/531ef4a6b962b06ef865ba34ea6e952a7d4b3a2b/snakemake/report/__init__.py#L321-L325
I'm not very familiar with how Snakemake handles wildcards, but I believe the solution is to detect whether the script path contains wildcards in the {wildcards.<name>} format and, if so, retrieve the source code from the completed script path.
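To make the idea concrete, here is a rough standalone sketch of what I have in mind. It does not use any Snakemake internals: the regex, has_wildcard_attr, and resolve_script_path are hypothetical names I made up for illustration, and the wildcard values are passed in as a plain dict. It only shows detecting {wildcards.<name>} placeholders and filling them in per job, so that each job's concrete script could then be loaded.

```python
import re

# Hypothetical pattern for placeholders of the form {wildcards.<name>}.
# This is an illustration only, not the pattern Snakemake itself uses.
WILDCARD_ATTR = re.compile(r"\{wildcards\.(?P<name>\w+)\}")


def has_wildcard_attr(script_path: str) -> bool:
    """Return True if the path contains at least one {wildcards.<name>} placeholder."""
    return WILDCARD_ATTR.search(script_path) is not None


def resolve_script_path(script_path: str, wildcards: dict) -> str:
    """Replace every {wildcards.<name>} placeholder with the job's wildcard value."""

    def substitute(match: re.Match) -> str:
        name = match.group("name")
        if name not in wildcards:
            raise KeyError(f"no value for wildcard {name!r} in {script_path!r}")
        return str(wildcards[name])

    return WILDCARD_ATTR.sub(substitute, script_path)


# The script path from the minimal example, resolved once per job.
template = "scripts/process_{wildcards.filetype}.py"
resolved = [
    resolve_script_path(template, {"filetype": filetype})
    for filetype in ("foo", "bar")
]
```

With the minimal example above, this resolves the template to scripts/process_foo.py and scripts/process_bar.py, one per use_script job. The report code would presumably need to do something similar per job rather than per rule, since a single rule can map to several scripts.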