GangSTR
Downstream filtering for haploid TRs
Nice tool! I am trying to use it to identify TR expansions in a bacterial genome. GangSTR itself runs fine, but the problem is the subsequent filtering with dumpSTR (from TRTools). All sites are filtered out even after lowering the thresholds, which I suspect is an issue that starts at the GangSTR stage. Are there specific considerations when --ploidy 1 is set and the genome is haploid?
Hi @Owuorgpo, unfortunately --ploidy 1 is relatively under-tested, and issues are prevalent in both GangSTR and TRTools. Would it be possible to share the VCF/BAM files that were used in your run to help with debugging? Note: I'm currently only working part-time at the lab, so development is going to be a bit slow. Apologies in advance!
Sample.vcf.gz Sure, here is a VCF file from GangSTR. It is unfiltered, straight from GangSTR with the default settings at --ploidy 1. Although dumpSTR is expected to drop most of the calls that are not supported in the BAM file, it ends up dropping everything, including TR expansions we have validated before.
Hi @Owuorgpo, sorry for the very delayed response. My plate has been quite full with an internship that I recently started. Unfortunately I won't be able to debug this immediately. I will keep the issue open and get to it as soon as I can set aside some time.
Hi @nmmsv, I am getting a similar error when filtering calls with dumpSTR if the --ploidy 1 or --samp-sex option was used on haploid chromosomes. The process terminates with an error. I am working on human WGS. See the attached sample VCFs. This is the error: " ml = [int(item) for item in sample["REPCN"]] ; TypeError: 'int' object is not iterable "
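The traceback suggests that for haploid calls REPCN arrives as a single integer rather than a sequence, so the list comprehension has nothing to iterate over. Below is a minimal sketch of a defensive normalization, assuming that is the cause. It is not the TRTools code or an official fix: `repcn_as_list` is a hypothetical helper, and the plain dicts only stand in for the per-sample record.

```python
def repcn_as_list(value):
    """Return REPCN as a list of ints, whether it is a scalar or a sequence.

    Hypothetical helper for illustration; not part of TRTools.
    """
    if value is None:
        # Missing genotype: no copy numbers to report.
        return []
    if isinstance(value, (list, tuple)):
        # Diploid-style field, e.g. (12, 15); skip missing alleles.
        return [int(item) for item in value if item is not None]
    # Haploid-style field, e.g. 12: wrap the scalar in a list.
    return [int(value)]


# Stand-ins for the per-sample record (assumed shapes, not real parser output).
diploid_sample = {"REPCN": (12, 15)}
haploid_sample = {"REPCN": 12}

print(repcn_as_list(diploid_sample["REPCN"]))
print(repcn_as_list(haploid_sample["REPCN"]))
```

With a guard like this, the line from the traceback would become `ml = repcn_as_list(sample["REPCN"])`, which handles both the diploid and the haploid shape.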