Support separate analysis of R1/R2 in paired end BAM files

Open s-andrews opened this issue 5 years ago • 1 comments

If we are reading from a BAM file we currently generate a single report which mixes the two reads. This isn't ideal, since some effects will only affect one of them.

To give some options for this can we:

  1. Add a command line option (maybe --first and --second) to only analyse one of the pair of reads

  2. Add a warning if someone analyses a paired end BAM file without specifying either of the above options
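A minimal sketch of the two points above, assuming the selection is done on the SAM flag field. The flag bits (0x1 paired, 0x40 first in pair, 0x80 second in pair) come from the SAM specification; the function names and the `mode` parameter are hypothetical and are not part of FastQC.

```python
# SAM flag bits as defined in the SAM specification.
FLAG_PAIRED = 0x1    # template has multiple segments (paired end)
FLAG_FIRST = 0x40    # first segment in the template (R1)
FLAG_SECOND = 0x80   # last segment in the template (R2)


def keep_read(flag, mode=None):
    """Return True if a read with this flag should be analysed.

    mode is a hypothetical stand-in for the proposed options:
    "first" for --first, "second" for --second, None for neither.
    """
    if mode == "first":
        return bool(flag & FLAG_FIRST)
    if mode == "second":
        return bool(flag & FLAG_SECOND)
    return True  # no option given: current behaviour, analyse everything


def needs_paired_warning(flag, mode=None):
    """Return True if a paired read is seen and neither option was given."""
    return mode is None and bool(flag & FLAG_PAIRED)
```

For example, a read with flag 99 (paired, first in pair) passes `keep_read(99, "first")` but not `keep_read(99, "second")`, and would trigger the warning if no option was supplied.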

Ideally it would be nice to make it so that two reports were generated in a single pass, but that's going to take some serious re-working of some of the internals. The above points should be fairly quick to implement.

s-andrews avatar Dec 03 '20 14:12 s-andrews

As mentioned in email correspondence, having it automatically dispatch threads for --first and --second when a paired end file is detected would be a nice feature. While separate threads that each read the source might not be quite as efficient as a single thread that scans and divvies out per-read, it might be easier to implement. From a disk IO perspective, I imagine that if the threads were started simultaneously, the data both need would still be in the read cache most of the time.
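The two-thread idea could look roughly like the sketch below. It only illustrates the structure: an in-memory list of `(flag, sequence)` tuples stands in for the BAM file, the statistics are reduced to simple counts, and `scan` and `analyse_both` are hypothetical names, not FastQC internals.

```python
from concurrent.futures import ThreadPoolExecutor

FLAG_FIRST = 0x40    # SAM flag bit: first in pair (R1)
FLAG_SECOND = 0x80   # SAM flag bit: second in pair (R2)


def scan(records, mask):
    """Make one full pass over the source, keeping reads with `mask` set."""
    reads = 0
    bases = 0
    for flag, seq in records:
        if flag & mask:
            reads += 1
            bases += len(seq)
    return {"reads": reads, "bases": bases}


def analyse_both(records):
    """Start one scan per read of the pair at the same time."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(scan, records, FLAG_FIRST)
        second = pool.submit(scan, records, FLAG_SECOND)
        return first.result(), second.result()


# Toy input: flags 99 and 83 are R1, flags 147 and 163 are R2.
records = [(99, "ACGT"), (147, "ACG"), (83, "AC"), (163, "A")]
r1, r2 = analyse_both(records)
```

Each scan reads the whole source and discards the half it does not need, which is the trade-off described above: twice the reading, but no changes to how a single analysis consumes its input.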

alanhoyle avatar Dec 03 '20 14:12 alanhoyle