
Allowing mismatches in a long STR

Open DomManou opened this issue 2 years ago • 0 comments

Hi,

I am trying to use GangSTR to genotype a 52 bp long STR in several samples. I identified the STR using the UCSC Genome Browser and RepeatMasker (link: https://genome-euro.ucsc.edu/cgi-bin/hgc?hgsid=286603936_wMmBef9vkWmZKbnIKQ8k97LFAdv3&db=hub_51387_GCA_905237065.2&c=HG993268.2&l=74885285&r=74888665&o=74886109&t=74886738&g=hub_51387_simpleRepeat&i=TGTCTCTCTGACCCAC).

Although the region looks repetitive in the browser, the motif is not exactly identical in every copy of the repeat. As a result, in my output .vcf file:

  1. the reference allele is shown as a perfectly repeated motif, which does not match the actual sequence
  2. the genotype for this STR is returned as "." for every sample
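
To make "not completely identical" concrete, here is a minimal sketch that counts how many bases in each repeat copy differ from the nominal motif. It is not GangSTR code and says nothing about how GangSTR scores reads; the function name `copy_mismatches` and the toy motif and region are hypothetical placeholders, not the real locus. It also assumes the copies differ only by substitutions, so an insertion or deletion inside a copy would shift every window after it.

```python
def copy_mismatches(region: str, motif: str) -> list[int]:
    """Return the Hamming distance between the motif and each
    consecutive, non-overlapping motif-length window of the region.

    Trailing bases that do not fill a whole window are ignored.
    """
    k = len(motif)
    if k == 0:
        raise ValueError("motif must be non-empty")
    region = region.upper()
    motif = motif.upper()
    counts = []
    for start in range(0, len(region) - k + 1, k):
        window = region[start:start + k]
        # Count positions where this copy deviates from the motif.
        counts.append(sum(a != b for a, b in zip(window, motif)))
    return counts


# Toy data (made up for illustration): four copies of a 5 bp motif,
# two of which carry substitutions.
toy_motif = "ACGTT"
toy_region = "ACGTT" "ACGAT" "ACGTT" "TCGTA"
print(copy_mismatches(toy_region, toy_motif))  # [0, 1, 0, 2]
```

A run of zeros would mean a perfect repeat; non-zero entries mark the imperfect copies that the reference allele in the VCF does not reflect.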

Is there a way to account for possible mismatches within the repeats?

Best regards, Domniki

DomManou, May 16 '22 13:05