snakefmt
Misleading error associated with comments in multi-line statements
The following code:
rule call_peaks:
    input:
        "results/bed/{sample}_shifted.bed",
    output:
        multiext(
            "results/peaks/{sample}/{sample}",
            "_peaks.xls",
            "_treat_pileup.bdg", # Output by -B
            "_control_lambda.bdg", # This is produced even without a control input
            "_peaks.narrowPeak", # Output when -broad not supplied
            "_summits.bed", # Output when -broad not supplied
        ),
produces the following confusing linting error (full log text pasted below):

The workflow is functional and I believe that comments added in this fashion can be useful, so I propose either:
- Allow trailing comments in multi-line statements, as Python linters do
- Produce an error that is directly relevant to the nature of the linting violation and suggests a clear resolution (e.g. comments not allowed in multi-line statements)
2023-04-10 19:04:44 [INFO] File:[/github/workspace/workflow/rules/call_peaks.smk]
2023-04-10 19:04:44 [ERROR] Found errors in [snakefmt] linter!
2023-04-10 19:04:44 [ERROR] Error code: 123. Command output:
------
Error: In file "/github/workspace/workflow/rules/call_peaks.smk": InvalidPython: Black error:
\```
Cannot parse: 56:0: EOF in multi-line statement
\```
[INFO] In file "/github/workspace/workflow/rules/call_peaks.smk": 1 file(s) raised parsing errors 🤕
------
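As a quick sanity check on the first proposal, the `multiext(...)` call from the rule can be parsed on its own with Python's standard `ast` module. This is only a minimal sketch: it shows that the trailing comments are valid Python when the expression is taken in isolation, and it does not involve snakefmt or Black or say anything about where the failure originates. The names `OUTPUT_EXPR` and `is_valid_python` are made up for illustration.

```python
import ast

# The output expression from the rule above, with its trailing comments,
# taken out of the Snakemake rule so it can be parsed as plain Python.
OUTPUT_EXPR = '''\
multiext(
    "results/peaks/{sample}/{sample}",
    "_peaks.xls",
    "_treat_pileup.bdg",  # Output by -B
    "_control_lambda.bdg",  # This is produced even without a control input
    "_peaks.narrowPeak",  # Output when -broad not supplied
    "_summits.bed",  # Output when -broad not supplied
)
'''


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python (hypothetical helper)."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    print(is_valid_python(OUTPUT_EXPR))
```

Since the expression parses cleanly by itself, the "EOF in multi-line statement" message does not appear to reflect a problem with the Python as written.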
Thanks for the report @hepcat72. This is indeed a bug. We'll get around to trying to fix this soon. Thanks for your patience.
Just to confirm, what version are you running?
$ snakefmt --version
snakefmt, version 0.8.4
And whatever version github/super-linter/slim@v4 uses.
I think I've encountered the same bug, also on 0.8.4. Could be helpful as an extra test case:
rule run_nextclade:
    input:
        nextclade="nextclade",
        dataset=lambda w: f"data/nextclade_data/sars-cov-2{w.reference.replace('_','-')}.zip",
        sequences=f"data/{database}/nextclade{{reference}}.sequences.fasta",
    params:
        genes=GENES_SPACE_DELIMITED,
        translation_arg=lambda w: (
            # Nextclade takes a filename template in which it replaces {gene}
            # itself, so we want to pass thru {gene} literally to it and make
            # sure it isn't interpreted by the shell as a glob. We shellquote
            # here instead of with :q below because we don't want to pass an
            # empty string argument when this param is empty.
            shellquote(f"--output-translations=data/{database}/nextclade{w.reference}.translation_{{gene}}.upd.fasta")
            if w.reference == ""
            else ""
        ),
    output: # <---- this is the line where snakefmt reports the error (line 185)
        info=f"data/{database}/nextclade{{reference}}_new_raw.tsv",
Error:
❯ snakefmt --verbose workflow/snakemake_rules/nextclade.smk
[DEBUG]
snakefmt.exceptions.InvalidPython: Black error:
Cannot parse: 185:0: EOF in multi-line statement
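Because the reported line (the next keyword after the bracketed expression) is not where the comments actually are, it may help to locate them directly until this is fixed. Below is a rough sketch using only the standard `tokenize` module, which works at the token level and so does not need the file to be valid Python. It is not part of snakefmt; `comments_inside_brackets` and `SNIPPET` are hypothetical names, and it flags every comment inside brackets, not only the ones that trigger the error.

```python
import io
import tokenize

OPENERS = {"(", "[", "{"}
CLOSERS = {")", "]", "}"}


def comments_inside_brackets(source: str):
    """Return (line_number, comment_text) for each comment that sits
    inside an open (), [] or {} pair (hypothetical helper)."""
    depth = 0
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.OP:
            if tok.string in OPENERS:
                depth += 1
            elif tok.string in CLOSERS:
                depth -= 1
        elif tok.type == tokenize.COMMENT and depth > 0:
            found.append((tok.start[0], tok.string))
    return found


# Shortened version of the first rule in this thread.
SNIPPET = '''\
rule call_peaks:
    output:
        multiext(
            "results/peaks/{sample}/{sample}",
            "_peaks.xls",
            "_treat_pileup.bdg",  # Output by -B
            "_summits.bed",  # Output when -broad not supplied
        ),
'''

if __name__ == "__main__":
    for line_no, text in comments_inside_brackets(SNIPPET):
        print(f"line {line_no}: {text}")
```

Running it over a `.smk` file's contents gives candidate lines to inspect or temporarily move, which is more actionable than the line number in the Black message.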