Output files are not being removed when snakemake execution is stopped via ctrl+c
Snakemake version
7.14.0
Describe the bug
When I stop snakemake execution via ctrl+c, the output files are not removed, but they are removed when the execution stops because a rule fails.
Logs
Logs from the provided minimal example below.
When it triggers the described bug:
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job        count    min threads    max threads
-------  -------  -------------  -------------
sleep          6              1              1
targets        1              1              1
total          7              1              1
Select jobs to execute...
[Wed Sep 21 19:44:38 2022]
rule sleep:
output: testSleepS1
jobid: 1
reason: Missing output files: testSleepS1
wildcards: sample=S1
resources: tmpdir=/tmp, qq=3
Output created!
^CTerminating processes on user request, this might take some time.
[Wed Sep 21 19:44:40 2022]
Error in rule sleep:
jobid: 1
output: testSleepS1
shell:
touch testSleepS1
echo "Output created!"
sleep 10s
exit 1
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Complete log: .snakemake/log/2022-09-21T194438.069552.snakemake.log
What I'm missing is the following line:
Removing output files of failed job sleep since they might be corrupted:
If I run ls testSleep* | wc -l, the output is 1 (of course, it should be 0). If I then run rm testSleep* and rerun the version where the rule itself fails, the output file is correctly removed since it might be corrupted:
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job        count    min threads    max threads
-------  -------  -------------  -------------
sleep          6              1              1
targets        1              1              1
total          7              1              1
Select jobs to execute...
[Wed Sep 21 19:44:23 2022]
rule sleep:
output: testSleepS1
jobid: 1
reason: Missing output files: testSleepS1
wildcards: sample=S1
resources: tmpdir=/tmp, qq=3
Output created!
[Wed Sep 21 19:44:33 2022]
Error in rule sleep:
jobid: 1
output: testSleepS1
shell:
touch testSleepS1
echo "Output created!"
sleep 10s
exit 1
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Removing output files of failed job sleep since they might be corrupted:
testSleepS1
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2022-09-21T194423.476724.snakemake.log
Here I see the line that was missing before: "Removing output files of failed job sleep since they might be corrupted". Now the output of ls testSleep* | wc -l is 0.
Minimal example
rule targets:
    input: expand("testSleep{sample}", sample=["S1","S2","S3","S4","S5","S6"])

rule sleep:
    output: "testSleep{sample}"
    resources: qq=3
    threads: 2
    shell:
        """
        touch {output}
        echo "Output created!"
        sleep 10s
        exit 1
        """
Execution: snakemake -c1 --snakefile /path/to/provided/snakefile --rerun-incomplete
If you let the example run until it fails because of exit 1, you can check that ls $PWD/testSleep* | wc -l outputs 0, but if you run it again and hit ctrl+c when you see the message "Output created!", then ls $PWD/testSleep* | wc -l outputs 1, which means the file is not removed when the execution is stopped via ctrl+c.
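The leftover-file situation can be simulated without Snakemake at all; this is just a sketch of what the interrupted sleep rule leaves behind, reusing the filenames from the minimal example above:

```shell
# Sketch (no Snakemake needed): simulate what the interrupted "sleep" rule
# leaves behind, using the filenames from the minimal example above.
touch testSleepS1                  # stands in for `touch {output}`
ls testSleep* 2>/dev/null | wc -l  # prints 1: the partial output survives
rm -f testSleep*                   # the manual cleanup described above
ls testSleep* 2>/dev/null | wc -l  # prints 0 after manual removal
```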
Additional context
Since output files are not removed, I have to remove them manually in order to rerun, which is tedious in my case since I have to check a lot of files across a lot of different rules.
"Since output files are not removed, I have to remove them manually in order to rerun"
I think this is not entirely correct. If you hit CTRL-C in the middle of a job, when you re-run the pipeline you get:
Building DAG of jobs...
IncompleteFilesException:
The files below seem to be incomplete. If you are sure that certain files are not incomplete, mark them as complete with
snakemake --cleanup-metadata <filenames>
To re-generate the files rerun your command with the --rerun-incomplete flag.
Incomplete files:
testSleepS2
so if you add --rerun-incomplete you don't need to remove those files manually. However, I agree that it would be nice if snakemake removed those files before exiting due to CTRL-C.
Ok, now I think I understand what happened to me... This is a minimal example, but in my real execution I move .snakemake away whenever I detect that it exists, because I run multiple instances of Snakemake (in the past the shared .snakemake folder led to errors, so I decided to remove/move it after every execution). So when I re-run after hitting ctrl+c, the files are not removed; they still exist and are detected as finished, because the .snakemake folder is created fresh again (the information about the not-finished status lives in the older .snakemake).

I understand that in order to detect that these files were not finished correctly I have to leave the .snakemake directory where it is, but this makes me wonder: is it "problematic" to run multiple instances of Snakemake under the same .snakemake folder (I've run into situations where instances failed because they were executed concurrently)? Can I change the path/name of .snakemake with a CLI flag (I've checked --shadow-prefix, but it only affects the shadow directory, not the whole .snakemake directory)?
Now I understand this is not a bug, but a feature (at least, I think so now). Files are not removed just in case the content somehow finished, or in case the incomplete files are wanted for some reason, so Snakemake raises an error and gives you the chance to ignore it with --rerun-incomplete, which might be useful if you somehow know that your files were completed or you want to continue with the incomplete files.
Now I think it is right not to remove files when I stop the execution, but this leads me to wonder: why is the behavior different when I stop the execution with ctrl+c versus when it is stopped because of an error? In both situations we might want to keep those files.
Am I right about these thoughts?
An example of what I'm talking about, which led me to errors when running multiple instances of Snakemake:
for n in $(seq 1 100); do
    snakemake -c lang1=en lang2=fr shard=$n --snakefile Snakefile &> $n.log &
done
In general, I wouldn't tweak the .snakemake directory and would let snakemake handle it, unless you are sure of what you are doing.
snakemake is right to prevent you from running multiple pipelines on the same output directory at the same time. Different instances would overwrite each other's output, causing a mess. I'd say that one of the reasons to use a workflow manager is to prevent such things from happening.
Off the top of my head, I would resolve your situation either by including the for n in $(seq 1 100) logic inside the Snakefile itself and running multiple jobs (snakemake -j 100 ...), or by assigning each instance a separate output directory, like:
for n in $(seq 1 100); do
    snakemake -d output_$n -c lang1=en ...
done
There may be other/better solutions depending on your case, but I would say this is the correct behaviour of snakemake.
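A fuller sketch of the separate-directory alternative (directory names are illustrative; as far as I know, snakemake -d sets the working directory, so each instance would also get its own .snakemake metadata folder inside it):

```shell
# Give every instance its own working directory; with `snakemake -d DIR`
# the .snakemake metadata folder is created inside DIR, so instances
# no longer share state.
for n in 1 2 3; do
    mkdir -p "output_$n"
    # snakemake -d "output_$n" --snakefile Snakefile ... &> "$n.log" &
done
ls -d output_*
```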
I gave a basic example, but of course the output files are different for each instance (they are configured through different options provided with the -c flag, and the output files of the different instances do not overlap). I don't totally agree with the idea of integrating this functionality into the pipeline. At least in my case, I'm using a pipeline where I provide different input files and get parallel text, and I want to run experiments that quantify the total amount of parallel text I get when I vary some configuration options. For these reasons, I think the pipeline is doing what it has to do, and since I want to quantify the total amount of text obtained with different options, I'd like to run multiple instances with different configuration options in order to parallelize the experiments (and, of course, each instance is itself parallelized by Snakemake).
So, is there no way to modify the name of the .snakemake directory? I don't know, something like:
for n in $(seq 1 100); do
    dot_snakemake_directory=".snakemake_$n" # default in snakemake is `.snakemake`
    snakemake -c lang1=en lang2=fr input_files='["/path/to/WARC1", "/path/to/WARC2"]' \
        shard=$n --snakefile Snakefile --dot-snakemake-directory-name "$dot_snakemake_directory" &> $n.log &
done
And is there any known reason for the different behaviors when the execution is stopped via ctrl+c versus when a rule fails?
Thank you for the previous replies! :)
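On the last question: I haven't verified this against Snakemake's source, so treat it as a hypothetical illustration only. The failure path is the simple case for any driver process: it observes the child's non-zero exit status after the fact and can then delete the suspect output, roughly like this plain-bash sketch:

```shell
# Hypothetical sketch of the cleanup-on-failure path: the driver observes
# a non-zero exit status and removes the possibly corrupted output.
touch testSleepS1                     # the job created its output...
( exit 1 )                            # ...and then failed
status=$?
if [ "$status" -ne 0 ]; then
    rm -f testSleepS1                 # driver-side removal, as in the logs
    echo "Removing output files of failed job sleep"
fi
ls testSleepS1 2>/dev/null | wc -l    # prints 0
```

A ctrl+c, by contrast, delivers SIGINT to the whole foreground process group, so the driver itself is interrupted and any cleanup only runs from its signal handler; whether Snakemake's handler is meant to perform this removal is exactly what the report above is asking.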