Increase in read number from fast5 to fastq

Open • emilyjunkins opened this issue 7 years ago • 1 comment

Hello,

I have just used the fastq converter on my Albacore-basecalled fast5 files and noticed that the number of reads increases. For instance, if I count the number of fast5 files in a directory containing only fast5 files: $ ls ./ | wc -l

9231

But when I convert to fastq, the number of 'reads' increases: $ poretools fastq ./ | grep '^@' | wc -l

20037

Is this what I should expect? Or is there something wrong with what I am doing, or something I am not understanding about the conversion from fast5 to fastq?

emilyjunkins, Apr 19 '17 21:04

These may be 2D reads, in which case you are getting template strands as well as complement and 2D reads. If you just want 2D, add --type 2D, and if you just want template, add --type strand.

Additionally, I do not think your grep command will necessarily be accurate, as I believe the quality score line can start with an @ character. I prefer to do a wc -l without the grep and divide by 4 to get the read count.

nickloman, Apr 19 '17 21:04
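
To illustrate the counting point in the reply above, here is a minimal Python sketch comparing the two approaches. It assumes strict four-line FASTQ records (no wrapped sequence or quality lines); the function names and the two-read example are hypothetical and are not part of poretools.

```python
def count_fastq_records(lines):
    """Count reads as (number of lines) / 4, assuming 4-line FASTQ records."""
    n_lines = sum(1 for _ in lines)
    if n_lines % 4 != 0:
        raise ValueError("line count is not a multiple of 4; input may be truncated or wrapped")
    return n_lines // 4


def count_at_lines(lines):
    """Mimic grep '^@' | wc -l: count every line that starts with '@'."""
    return sum(1 for line in lines if line.startswith("@"))


# Hypothetical two-read example. '@' is a valid quality character,
# so the quality line of read1 also starts with '@'.
example = [
    "@read1", "ACGT", "+", "@III",
    "@read2", "ACGT", "+", "IIII",
]

print(count_fastq_records(example))  # 2
print(count_at_lines(example))       # 3
```

The grep-style count is inflated by any quality line that happens to begin with `@`, whereas dividing the line count by 4 is not affected by the quality characters.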