gatb-minia-pipeline
Flag to remove intermediate files
Hello,
Is there a flag/option to remove intermediate files (SAM, and the .glue files specifically) during a run of the pipeline? I'm running a large assembly and the folder is > 5 TB at the moment. I didn't see any options in the help or in this repo.
You can always remove the *.h5 files if your contig assemblies have finished ;)
Of course, I'm assuming they have, since you have reached the mapping stage.
I also tend to remove the glue files.
Yes, I do as well. I suppose I'm suggesting this as an enhancement to the pipeline. During large assemblies, where you aren't watching for the right moment to remove the files, you may saturate a filesystem and cause the run to fail.
I had the same issue and created a modified version that has a --cleanup flag that removes *.h5 and glue files after each iteration.
If one of the developers wants to review the code, I can make a PR. Otherwise I can just share the script if somebody ever needs it.
The script would be appreciated on my end. Obviously I can't speak for the devs, but the ability to clean those files after each iteration seems highly desirable when assembling very large datasets.
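For anyone who needs a stopgap in the meantime, here is a minimal sketch of the kind of per-iteration cleanup discussed above. It is not the modified script mentioned in this thread and not part of the pipeline. The function name `cleanup_intermediates`, the glob patterns `*.h5` and `*.glue*`, and the assumption that these files sit directly in the working directory are all mine, so check them against your own output folder (try `dry_run=True` first) and only run it once the contigs for that iteration are finished.

```python
import glob
import os


def cleanup_intermediates(workdir, patterns=("*.h5", "*.glue*"), dry_run=False):
    """Delete files in workdir that match any pattern.

    Hypothetical helper: the default patterns are assumptions about how the
    intermediate files are named, not something taken from the pipeline.
    Returns (list of matched paths, total bytes freed or that would be freed).
    """
    removed = []
    freed = 0
    for pattern in patterns:
        for path in sorted(glob.glob(os.path.join(workdir, pattern))):
            # Skip directories and anything already counted by an earlier pattern.
            if not os.path.isfile(path) or path in removed:
                continue
            size = os.path.getsize(path)
            if not dry_run:
                os.remove(path)
            removed.append(path)
            freed += size
    return removed, freed
```

Calling `cleanup_intermediates("/path/to/run", dry_run=True)` lists what would be deleted without touching anything. Other patterns (for example `"*.sam"`) can be passed via `patterns` if you are sure those files are no longer needed.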