
add a --resume option

Open Psy-Fer opened this issue 11 months ago • 0 comments

Sometimes a run will crash or hit a wall-time limit due to hardware or system limits.

It would be helpful to easily resume the sequencing run.

I think a quick way to do this would be to read the sam/fastq already written, put the readIDs into a set, and then start reading the blow5 again, doing a quick set check on each readID and skipping any read that is already in the output.

The only initial overhead is extracting the readIDs from the output file.
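The idea above can be sketched roughly as follows. This is only an illustration, not buttery-eel code: the function names (`load_done_ids`, `skip_done`) and the `"read_id"` record key are hypothetical, the FASTQ is assumed to use plain 4-line records, and the blow5 reader is stood in for by any iterable of dict-like records.

```python
import os


def read_ids_from_fastq(path):
    """Yield read IDs from a FASTQ file, assuming 4 lines per record."""
    with open(path) as fh:
        for line_no, line in enumerate(fh):
            # Only every 4th line is a header; quality lines may also start with '@'.
            if line_no % 4 == 0 and line.startswith("@"):
                fields = line[1:].split()
                if fields:
                    # Header looks like "@<readID> <optional description>".
                    yield fields[0]


def read_ids_from_sam(path):
    """Yield read IDs (QNAME, the first column) from a SAM file."""
    with open(path) as fh:
        for line in fh:
            # Header lines start with '@'; QNAME itself cannot contain '@'.
            if line.startswith("@") or not line.strip():
                continue
            yield line.split("\t", 1)[0]


def load_done_ids(path):
    """Build the set of already-written read IDs, picking the parser by extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext == ".fastq":
        return set(read_ids_from_fastq(path))
    if ext == ".sam":
        return set(read_ids_from_sam(path))
    raise ValueError(f"cannot detect format from extension: {path}")


def skip_done(records, done_ids):
    """Yield only records whose read ID is not in the done set.

    `records` is a placeholder for whatever the blow5 reader yields;
    the "read_id" key is an assumption made for this sketch.
    """
    for rec in records:
        if rec["read_id"] not in done_ids:
            yield rec
```

One thing this sketch does not handle: if the run crashed mid-write, the last record in the output may be truncated, so its readID would be counted as done even though the record is incomplete.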

If the original output was something like --output my.fastq, then the args for resuming would be --resume my.fastq and --output my2.fastq, and the user can merge the two files at the end.

If it's a single file, that's easy: just read it, fastq or sam, using the extension for detection.

If it's multiple files, as with pass/fail or barcoding, then have the option of multiple args, like --resume file1.fastq file2.fastq, or give a path like --resume ./somepath/, do a dir check, and recursively read all files, using an extension check for fastq/sam.
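A minimal sketch of how the proposed arguments could be handled is below. The `--resume` flag is the proposal from this issue, not an existing option, and `build_parser` and `collect_resume_files` are hypothetical names used only for illustration. Each `--resume` value is either a file or a directory; directories are walked recursively and only `.fastq` and `.sam` files are kept.

```python
import argparse
import os

# Extensions treated as basecalled output; an assumption for this sketch.
RESUME_EXTENSIONS = (".fastq", ".sam")


def build_parser():
    """Parser with the proposed --resume option (one or more files or dirs)."""
    parser = argparse.ArgumentParser(description="resume sketch")
    parser.add_argument("--output", required=True, help="new output file")
    parser.add_argument("--resume", nargs="+", default=[],
                        help="previous output file(s) or a directory of them")
    return parser


def collect_resume_files(paths):
    """Expand files and directories into a sorted list of fastq/sam files."""
    found = []
    for path in paths:
        if os.path.isdir(path):
            # Dir check: recurse and keep only files with a known extension.
            for root, _dirs, files in os.walk(path):
                for name in files:
                    if name.lower().endswith(RESUME_EXTENSIONS):
                        found.append(os.path.join(root, name))
        elif os.path.isfile(path):
            if not path.lower().endswith(RESUME_EXTENSIONS):
                raise ValueError(f"unsupported extension: {path}")
            found.append(path)
        else:
            raise FileNotFoundError(path)
    return sorted(found)
```

Usage would then look something like `args = build_parser().parse_args(["--resume", "./somepath/", "--output", "my2.fastq"])` followed by `collect_resume_files(args.resume)`, with the readIDs from every returned file going into the same set.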

That should cover almost every case. If someone breaks it, they can tell me what they need.

Psy-Fer avatar Mar 13 '24 04:03 Psy-Fer