BS-Snper

CIGAR strings with multiple operations not parsed properly

Open • knowah opened this issue 4 years ago • 0 comments

I have a BAM file with reads that contain an insertion at a specific position. I noticed that the basecall counts in the --output file produced by rrbsSnp (and subsequently, the genotypes in the VCF file created by BS-Snper.pl) were inaccurate around this insertion.

The CIGAR string for these reads has three operations (e.g., 143M1I7M), but only the last operation appears to be processed, however many operations the read has. I confirmed this by checking the values of record->cigar and record->len at the end of sam_funcs.c:parseBuffer(). In the example above they are 7M and 7, respectively, so only 7 bases are included in the MapRecord for this read (instead of 150), and the basecalls are recorded at the wrong reference positions.
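
To illustrate the expected behaviour, here is a minimal sketch of walking every operation in a CIGAR string instead of keeping only the last one. It is not BS-Snper's code: the CigarSummary type and the summarize_cigar function are hypothetical names made up for this example, and it assumes the standard SAM operation letters (M, I, D, N, S, H, P, =, X). Under the SAM convention that I consumes read bases but not reference bases, 143M1I7M comes out as three operations covering 151 read bases across 150 reference positions.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical summary type for this sketch; NOT BS-Snper's MapRecord. */
typedef struct {
    long query_len; /* read bases consumed (M, I, S, =, X) */
    long ref_len;   /* reference bases consumed (M, D, N, =, X) */
    int  n_ops;     /* number of CIGAR operations seen */
} CigarSummary;

/* Walk the whole CIGAR string, accumulating over every operation.
 * Returns 0 on success, -1 if the string is empty or malformed. */
static int summarize_cigar(const char *cigar, CigarSummary *out)
{
    const char *p = cigar;

    out->query_len = 0;
    out->ref_len = 0;
    out->n_ops = 0;
    if (cigar == NULL || *cigar == '\0')
        return -1;

    while (*p != '\0') {
        long n = 0;

        /* Each operation is a run of digits followed by one letter. */
        if (!isdigit((unsigned char)*p))
            return -1;
        while (isdigit((unsigned char)*p)) {
            n = n * 10 + (*p - '0');
            p++;
        }

        switch (*p) {
        case 'M': case '=': case 'X':   /* consumes read and reference */
            out->query_len += n;
            out->ref_len += n;
            break;
        case 'I': case 'S':             /* consumes read only */
            out->query_len += n;
            break;
        case 'D': case 'N':             /* consumes reference only */
            out->ref_len += n;
            break;
        case 'H': case 'P':             /* consumes neither */
            break;
        default:                        /* unknown letter or missing op */
            return -1;
        }
        out->n_ops++;
        p++;
    }
    return 0;
}
```

Calling summarize_cigar("143M1I7M", &s) should leave s.n_ops at 3, whereas the behaviour described above corresponds to summarizing "7M" alone. The key point is that the lengths are accumulated across the loop rather than overwritten on each operation.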

knowah • Jan 09 '21 13:01