sc2-illumina-pipeline

Unpaired reads and bad test sample

Open • jackkamm opened this issue 4 years ago • 1 comment

While getting single-end reads working, I noticed some strange behavior with the test sample RR057e_00734_subsampled.

There seems to be a problem with the mate pairing for this test sample. When I align it as single-end reads, or align it as paired-end reads but skip the host filtering and kraken steps, it recovers a full genome. In the latter case (paired but without filtering), very few of the aligned reads are properly paired.

When I run the pipeline under the standard settings (paired, with the filtering steps), the unpaired alignments seem to get discarded, so no genome is recovered.

However, the headers for the reads in the fastq seem to be properly matched.
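For reference, a header check like the one described above can be sketched in a few lines of Python. This is only an illustration, not code from the pipeline: the function names are hypothetical, and it assumes plain 4-line FASTQ records with mates listed in the same order in both files.

```python
import re


def read_name(header: str) -> str:
    """Return the read name from a FASTQ header line, without any mate suffix."""
    name = header[1:].split()[0]        # drop the leading '@' and any comment field
    return re.sub(r"/[12]$", "", name)  # drop a trailing /1 or /2 if present


def fastq_names(lines):
    """Yield the read name of every record in an iterable of FASTQ lines.

    Assumes each record is exactly 4 lines (header, sequence, '+', quality).
    """
    for i, line in enumerate(lines):
        line = line.strip()
        if i % 4 == 0 and line:  # header is the first line of each record
            yield read_name(line)


def mismatched_pairs(r1_lines, r2_lines):
    """Return (record index, R1 name, R2 name) for every pair whose names differ."""
    r1 = list(fastq_names(r1_lines))
    r2 = list(fastq_names(r2_lines))
    if len(r1) != len(r2):
        raise ValueError(f"R1 has {len(r1)} reads but R2 has {len(r2)}")
    return [(i, a, b) for i, (a, b) in enumerate(zip(r1, r2)) if a != b]
```

An empty list from `mismatched_pairs` means the headers line up, which would match the observation here. It does not rule out that the sequences themselves come from different fragments.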

We should do 2 things:

  • Remove this weird sample and replace it with a better one for the unit test.
  • Consider whether we want to keep aligned reads whose mates don't align. It seems like they are getting discarded now.
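To make the second point concrete, here is a minimal Python sketch of the two filtering policies in terms of SAM flag bits. The bit values come from the SAM format; the function names are hypothetical, and whether the pipeline's filtering steps actually apply a proper-pair filter is an assumption made for illustration, not something confirmed in this issue.

```python
# SAM flag bits (defined by the SAM format)
PAIRED = 0x1         # read is part of a pair
PROPER_PAIR = 0x2    # aligner considers the pair properly aligned
UNMAPPED = 0x4       # this read is unmapped
MATE_UNMAPPED = 0x8  # this read's mate is unmapped


def sam_flag(line: str) -> int:
    """Extract the FLAG field (column 2) from a SAM alignment line."""
    return int(line.split("\t")[1])


def keep_proper_pairs_only(flag: int) -> bool:
    """Strict policy: keep a read only if it is in a proper pair."""
    return bool(flag & PROPER_PAIR)


def keep_any_mapped(flag: int) -> bool:
    """Lenient policy: keep any mapped read, even if its mate did not align."""
    return not (flag & UNMAPPED)


def summarize(sam_lines):
    """Count mapped alignment records, proper pairs, and reads with an unmapped mate."""
    counts = {"mapped": 0, "proper_pair": 0, "mate_unmapped": 0}
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        flag = sam_flag(line)
        if flag & UNMAPPED:
            continue
        counts["mapped"] += 1
        if flag & PROPER_PAIR:
            counts["proper_pair"] += 1
        if flag & MATE_UNMAPPED:
            counts["mate_unmapped"] += 1
    return counts
```

With samtools, the strict policy corresponds to `samtools view -f 2` and the lenient one to `samtools view -F 4`. If `summarize` shows many mapped reads but few proper pairs for a sample, the strict policy would drop most of its alignments.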

jackkamm, Jun 11 '20 12:06