bowtie2
Only one of two paired reads marked "read mapped in proper pair" (flag 2) in FLAG field
I have been running bowtie2 on paired-end sequences and noticed something odd when I tried to filter out reads that weren't properly paired. Orphan reads were left behind after filtering with samtools view -f 2. After digging around, I found that the initial alignment already contained mates flagged 163 and 113:
NS500418:295:HWG3TBGXX:4:11405:20193:10248 163 chr1 1624577 255
NS500418:295:HWG3TBGXX:4:11405:20193:10248 113 chr1 1677965 255
Running samtools fixmate changes the flags to 163 and 81, but the proper-pair bit (flag 2) is still set on only one of the two mates.
As I understand it, this means that one mate is marked "read mapped in proper pair" (flag 2) while the other is not, which I don't think should be possible.
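To make the mismatch explicit, here is a minimal Python sketch that decodes the FLAG values mentioned above into their individual bits. The bit meanings follow the SAM specification's FLAG table; the short names and the helper `decode_flag` are my own labels for illustration, not part of bowtie2 or samtools.

```python
# Bit values follow the SAM specification's FLAG table.
# The short names are illustrative labels, not official identifiers.
SAM_FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]


for flag in (163, 113, 81):
    print(flag, decode_flag(flag))
# 163 ['paired', 'proper_pair', 'mate_reverse', 'second_in_pair']
# 113 ['paired', 'reverse', 'mate_reverse', 'first_in_pair']
# 81 ['paired', 'reverse', 'first_in_pair']
```

Only 163 carries the proper-pair bit. Going from 113 to 81 clears the mate-reverse bit (0x20) and nothing else, so the two mates still disagree about whether they form a proper pair.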
Here is the bowtie command I ran:
bowtie2 -k1 -N1 -p16 -x hg19 -1 R1.fastq -2 R2.fastq -S file.sam
Found in version 2.1.0; I verified that 2.3.3 gives the same results.
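For anyone who wants to check how many pairs are affected in their own output, the following is a rough sketch of a scan for read names whose mates disagree on the proper-pair bit. It only needs the first two SAM columns (QNAME and FLAG). The function name `inconsistent_proper_pairs` is hypothetical, and skipping secondary and supplementary records is my own assumption about what is wanted here.

```python
from collections import defaultdict

PROPER_PAIR = 0x2
SECONDARY_OR_SUPPLEMENTARY = 0x100 | 0x800


def inconsistent_proper_pairs(sam_lines):
    """Return read names whose records disagree on the proper-pair bit."""
    states = defaultdict(set)
    for line in sam_lines:
        # Skip blank lines and SAM header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.split()
        qname, flag = fields[0], int(fields[1])
        # Assumption: only primary alignments matter for this check.
        if flag & SECONDARY_OR_SUPPLEMENTARY:
            continue
        states[qname].add(bool(flag & PROPER_PAIR))
    # A name is inconsistent if both True and False were observed.
    return sorted(q for q, seen in states.items() if len(seen) == 2)


records = [
    "NS500418:295:HWG3TBGXX:4:11405:20193:10248 163 chr1 1624577 255",
    "NS500418:295:HWG3TBGXX:4:11405:20193:10248 113 chr1 1677965 255",
]
print(inconsistent_proper_pairs(records))
# ['NS500418:295:HWG3TBGXX:4:11405:20193:10248']
```

On a real file the same function could be fed an open file handle for file.sam instead of the two-record list.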
Is it possible for you to share the paired-end read that yielded that behavior?
This issue can be triggered by "readthrough" reads, where the insert size is shorter than one of the reads.
scheme:
------->
<---
example:
Read 97 ref 4 30 47M = 9 0
CTGTCTCGACGTTTAAAGGCATTCAAGCCTAGAATTACACCATAATT
|||||||||||||||||||||||||||||||||||||||||||||||
CTGTCTCGACGTTTAAAGGCATTCAAGCCTAGAATTACACCATAATT
Read 145 ref 9 30 40M = 4 0
TCGACGTTTAAAGGCATTCAAGCCTAGAATTACACCATAA
||||||||||||||||||||||||||||||||||||||||
TCGACGTTTAAAGGCATTCAAGCCTAGAATTACACCATAA
read1: CTGTCTCGACGTTTAAAGGCATTCAAGCCTAGAATTACACCATAATT
read2: CTGTCTCGACGTTTAAAGGCATTCAAGCCTAGAATTACACCATAATT
reference: ...CTGTCTCGACGTTTAAAGGCATTCAAGCCTAGAATTACACCATAATT...
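The geometry of the example above can be checked with a small Python sketch that turns each record's POS and CIGAR into a reference interval and tests whether one mate lies entirely inside the other. The helpers `reference_span` and `contains` are hypothetical names, and this is not bowtie2's concordance logic, only a way to see the readthrough layout in numbers.

```python
import re

# CIGAR operations that consume reference bases, per the SAM specification.
REF_CONSUMING = set("MDN=X")


def reference_span(pos, cigar):
    """Return the 1-based inclusive (start, end) covered on the reference."""
    length = sum(
        int(count)
        for count, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)
        if op in REF_CONSUMING
    )
    return pos, pos + length - 1


def contains(outer, inner):
    """True if the interval `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


read1 = reference_span(4, "47M")  # flag 97: forward, first in pair
read2 = reference_span(9, "40M")  # flag 145: reverse, second in pair
print(read1, read2, contains(read1, read2))
# (4, 50) (9, 48) True
```

Here the reverse mate sits wholly inside the forward mate, which is the layout drawn in the scheme above.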