snakemake Report generation fails when script path contains wildcard

Open gibsramen opened this issue 2 years ago • 1 comments

Snakemake version

7.8.1

Describe the bug

When a rule uses a wildcard in the path to an external script, --report fails to resolve the script path and cannot include the script in the report.

Logs

Snakemake execution

$ snakemake --cores 1
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job           count    min threads    max threads
----------  -------  -------------  -------------
all               1              1              1
use_script        2              1              1
total             3              1              1

Select jobs to execute...

[Wed Jun  1 15:01:37 2022]
rule use_script:
    input: data/file1.txt
    output: results/file.bar.txt
    jobid: 2
    reason: Missing output files: results/file.bar.txt
    wildcards: filetype=bar
    resources: tmpdir=/tmp

[Wed Jun  1 15:01:38 2022]
Finished job 2.
1 of 3 steps (33%) done
Select jobs to execute...

[Wed Jun  1 15:01:38 2022]
rule use_script:
    input: data/file1.txt
    output: results/file.foo.txt
    jobid: 1
    reason: Missing output files: results/file.foo.txt
    wildcards: filetype=foo
    resources: tmpdir=/tmp

[Wed Jun  1 15:01:38 2022]
Finished job 1.
2 of 3 steps (67%) done
Select jobs to execute...

[Wed Jun  1 15:01:38 2022]
localrule all:
    input: results/file.foo.txt, results/file.bar.txt
    jobid: 0
    reason: Input files updated by another job: results/file.bar.txt, results/file.foo.txt
    resources: tmpdir=/tmp

[Wed Jun  1 15:01:38 2022]
Finished job 0.
3 of 3 steps (100%) done
Complete log: .snakemake/log/2022-06-01T150136.792964.snakemake.log

Report generation

$ snakemake --report report.html
Building DAG of jobs...
Creating report...
Loading script code for rule use_script
WorkflowError:
Failed to open source file /home/grahman/projects/snakemake-bug/workflow/scripts/process_{wildcards.filetype}.py
FileNotFoundError: [Errno 2] No such file or directory: '/home/grahman/projects/snakemake-bug/workflow/scripts/process_{wildcards.filetype}.py'

File tree

$ tree
.
├── data
│   └── file1.txt
├── results
│   ├── file.bar.txt
│   └── file.foo.txt
└── workflow
    ├── report
    │   └── workflow.rst
    ├── rules
    ├── scripts
    │   ├── process_bar.py
    │   └── process_foo.py
    └── Snakefile

6 directories, 7 files

Minimal example

Snakefile

report: "report/workflow.rst"

rule all:
    input:
        expand("results/file.{filetype}.txt", filetype=["foo", "bar"])

rule use_script:
    input:
        "data/file1.txt"
    output:
        "results/file.{filetype}.txt"
    script:
        "scripts/process_{wildcards.filetype}.py"

scripts/process_foo.py

with open(snakemake.output[0], "w") as f:
    f.write("This is process_foo.py")

scripts/process_bar.py

with open(snakemake.output[0], "w") as f:
    f.write("This is process_bar.py")

Additional context

The workflow itself runs fine: the files requested by rule all are generated properly. However, I would expect --report to resolve the wildcard and load the matching Python script (process_foo.py or process_bar.py) into the report.
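To make the expected behavior concrete, here is a minimal sketch of the path resolution I have in mind. It uses only plain Python string formatting; `resolve_script` is a hypothetical helper written for this illustration and is not part of Snakemake's API.

```python
from types import SimpleNamespace

# Script path exactly as written in the rule's script directive.
script_template = "scripts/process_{wildcards.filetype}.py"


def resolve_script(template, **wildcards):
    """Fill {wildcards.<name>} placeholders with concrete values.

    Hypothetical helper for illustration; Snakemake's internals may differ.
    """
    return template.format(wildcards=SimpleNamespace(**wildcards))


# One resolved path per wildcard value requested by rule all.
resolved = [resolve_script(script_template, filetype=ft) for ft in ["foo", "bar"]]
print(resolved)
```

With the file tree above, both resolved paths exist under workflow/, whereas the unformatted template path does not, which matches the FileNotFoundError in the log.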

Jun 01 '22 22:06 gibsramen

So I have been investigating this and I think I have a hypothesis for why this occurs.

In the following line, the script file is checked for any wildcards. This check is evaluating to False.

https://github.com/snakemake/snakemake/blob/531ef4a6b962b06ef865ba34ea6e952a7d4b3a2b/snakemake/report/__init__.py#L292

As a result, none of the if statements are True and Snakemake assumes that it is dealing with a run directive.

https://github.com/snakemake/snakemake/blob/531ef4a6b962b06ef865ba34ea6e952a7d4b3a2b/snakemake/report/__init__.py#L321-L325

I'm not super well-versed in how Snakemake handles wildcards but I believe the solution is to determine whether the script has wildcards in the {wildcards.<name>} format and retrieve the correct source code depending on the completed script path.
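A rough sketch of what I mean, using only the standard library. The function names (`wildcard_fields`, `script_paths_for_jobs`) and the shape of `job_wildcards` (one dict per job) are my own assumptions for illustration; this is not how Snakemake's report code is actually structured.

```python
from string import Formatter
from types import SimpleNamespace


def wildcard_fields(script_path):
    """Return the names referenced as {wildcards.<name>} in script_path."""
    names = []
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion).
    for _, field_name, _, _ in Formatter().parse(script_path):
        if field_name and field_name.startswith("wildcards."):
            names.append(field_name.split(".", 1)[1])
    return names


def script_paths_for_jobs(script_path, job_wildcards):
    """Return the sorted, de-duplicated script paths used by a rule's jobs.

    job_wildcards is assumed to be a list of dicts, one per job,
    mapping wildcard names to their values.
    """
    if not wildcard_fields(script_path):
        # No placeholders: every job shares the same script.
        return [script_path]
    return sorted(
        {script_path.format(wildcards=SimpleNamespace(**wc)) for wc in job_wildcards}
    )


paths = script_paths_for_jobs(
    "scripts/process_{wildcards.filetype}.py",
    [{"filetype": "foo"}, {"filetype": "bar"}],
)
print(paths)
```

The report could then load the source of each completed path instead of trying to open the template path literally. Whether the report should show one script per job or one per distinct path is a design question I have not looked into.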

Jul 11 '22 23:07 gibsramen
