buttery-eel
add a --resume option
Sometimes a run will crash or hit a wall-time limit due to hardware or system limits.
It would be helpful to easily resume the sequencing run.
I think a quick way to do this would be to read the sam/fastq already written, put the readIDs into a set, and then start reading the blow5 again, doing a quick set check on each readID and skipping any that are already in the output.
The only overhead is the initial extraction of readIDs from the output file.
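A rough sketch of that idea in plain Python, just to show the shape of it. The function names are made up for illustration and are not part of buttery-eel. It assumes uncompressed output, 4-line fastq records, and that the readID in the output is the same as the readID in the blow5.

```python
import os


def fastq_read_ids(path):
    """Collect readIDs from a 4-line-per-record fastq file."""
    ids = set()
    with open(path) as fh:
        while True:
            header = fh.readline()
            fh.readline()  # sequence
            fh.readline()  # '+' separator
            qual = fh.readline()
            if not qual:
                # EOF, or a final record cut off before its quality line:
                # do not count it, so that read gets basecalled again.
                break
            fields = header[1:].split()
            if header.startswith("@") and fields:
                ids.add(fields[0])  # readID is the first token after '@'
    return ids


def sam_read_ids(path):
    """Collect readIDs (QNAME, first column) from an uncompressed sam file."""
    ids = set()
    with open(path) as fh:
        for line in fh:
            if line.startswith("@") or not line.strip():
                continue  # header line or blank line
            ids.add(line.rstrip("\n").split("\t", 1)[0])
    return ids


def load_done_ids(path):
    """Pick the parser from the file extension (assumed: .fastq or .sam)."""
    ext = os.path.splitext(path)[1].lower()
    if ext == ".fastq":
        return fastq_read_ids(path)
    if ext == ".sam":
        return sam_read_ids(path)
    raise ValueError(f"cannot detect format from extension: {path}")


def reads_still_to_call(read_ids, done):
    """Yield only the readIDs that are not already in the output."""
    return (rid for rid in read_ids if rid not in done)
```

Here `read_ids` stands in for whatever iterates over the blow5; in the real tool the set check would sit inside the existing read loop.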
If the original output was something like --output my.fastq
then the args for the resumed run would be --resume my.fastq
and --output my2.fastq
and then the user can merge the two files at the end.
If it's a single file, that's easy: just read it, fastq or sam, using the extension for detection.
If it's multiple files, like pass/fail or barcoding, then have the option of multiple args, like --resume file1.fastq file2.fastq
or give a path like --resume ./somepath/
and do a dir check, then recursively read all files, using an extension check for fastq/sam.
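Something like the following could handle the single file, multiple files, and directory cases in one place. This is only a sketch: `--resume` is the option proposed here, and `collect_resume_files` and `build_parser` are hypothetical names, not existing buttery-eel code.

```python
import argparse
import os

# Extensions treated as previous output (as proposed above: fastq and sam).
RESUME_EXTS = {".fastq", ".sam"}


def collect_resume_files(paths):
    """Expand --resume values (files and/or directories) into a file list."""
    files = []
    for p in paths:
        if os.path.isdir(p):
            # Directory: walk it recursively, keeping only fastq/sam files.
            for root, _dirs, names in os.walk(p):
                for name in sorted(names):
                    if os.path.splitext(name)[1].lower() in RESUME_EXTS:
                        files.append(os.path.join(root, name))
        elif os.path.isfile(p):
            if os.path.splitext(p)[1].lower() not in RESUME_EXTS:
                raise ValueError(f"unsupported extension for --resume: {p}")
            files.append(p)
        else:
            raise FileNotFoundError(f"--resume path does not exist: {p}")
    return files


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--output", help="file for the newly basecalled reads")
    # nargs="+" accepts one or more files and/or directories.
    parser.add_argument("--resume", nargs="+", default=[],
                        help="previous output file(s) or a directory of them")
    return parser
```

The readIDs from every file returned by `collect_resume_files` would then go into one set before the blow5 is read again.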
That should cover almost every case. If someone breaks it, they can tell me what they need.