quip
Fix mate sequence when reading sam/bam
Previously, when reading sam/bam, the case tid == mtid was detected and the mate sequence name was mapped to "=". This caused missing sequence name errors upon decompression. Because the tid == mtid case is already handled when writing sam/bam, this patch simply records the full mate sequence name on reading, which resolves the name-matching errors.
Example read after decompression, before the patch (the mate reference field, column 7, is "*"):
SL1344_1_530_0:0:0_0:0:0_6c9 163 SL1344 1 60 70M * 461 530 AGAGATTACGTCTGGTTGCAAGAGATCATGACAGGGGGAATTGGTTGAAAATAAATATATCGCCAGCAGC IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII MQ:i:60 AS:i:70 RG:Z:mysample1 NM:i:0 MC:Z:70M MD:Z:70 ms:i:2800 XS:i:0
and after the patch (column 7 is "="):
SL1344_1_530_0:0:0_0:0:0_6c9 163 SL1344 1 60 70M = 461 530 AGAGATTACGTCTGGTTGCAAGAGATCATGACAGGGGGAATTGGTTGAAAATAAATATATCGCCAGCAGC IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII MQ:i:60 AS:i:70 RG:Z:mysample1 NM:i:0 MC:Z:70M MD:Z:70 ms:i:2800 XS:i:0
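The division of labour described above can be sketched as follows. This is a minimal illustration, not quip's actual code: the function names `read_mate_name` and `format_rnext` and the plain list used as a reference-name table are assumptions made for the example. The reading side always stores the full mate reference name, and only the writing side abbreviates it to "=" (or "*" when there is no mate reference), as the SAM format prescribes for the RNEXT field.

```python
from typing import List, Optional


def read_mate_name(mtid: int, ref_names: List[str]) -> Optional[str]:
    """Reading side (hypothetical helper): record the full mate reference name.

    No "=" shorthand is applied here, even when the mate maps to the same
    reference as the read. A negative mtid means no mate reference.
    """
    if mtid < 0:
        return None
    return ref_names[mtid]


def format_rnext(rname: Optional[str], mate_name: Optional[str]) -> str:
    """Writing side (hypothetical helper): produce the SAM RNEXT field.

    "*" when the mate reference is unavailable, "=" when it is identical
    to the read's own reference, and the full name otherwise.
    """
    if mate_name is None:
        return "*"
    if mate_name == rname:
        return "="
    return mate_name


# Reference table and ids loosely modelled on the example record above.
ref_names = ["SL1344"]
tid, mtid = 0, 0

mate = read_mate_name(mtid, ref_names)       # full name is stored
rnext = format_rnext(ref_names[tid], mate)   # abbreviated only on output
print(mate, rnext)
```

Keeping the abbreviation on the writing side means the stored name can always be looked up during decompression, while the output still uses the compact "=" form.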