ExpansionHunter
Exome Sequencing - Accuracy issues?
Hello Igor,
I am currently extracting STRs from the 1000 Genomes WGS data and from a WES dataset. When I compare the distributions of the catalog STRs between the two datasets, I see some discrepancies. I understand the obvious limitation of the shorter read length in WES (76 bp) compared with WGS (100 bp). Should I tweak any parameters when extracting STRs from WES?
Thank you in advance.
With Kind Regards, William
Hello William,
Great question. It is quite a bit harder to genotype STRs in WES than in PCR-free WGS because of the less even read coverage and the amplification biases inherent to WES. To maintain good accuracy, EH requires that the repeat region plus 1 kb flanks on both sides be sequenced at relatively even coverage. In WES, some repeat regions might be only partially covered by reads, or the interior of the repeat may be amplified less well due to GC bias. This is why EH officially supports only PCR-free WGS.
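As a rough illustration of the coverage requirement described above, here is a minimal sketch that compares mean depth in the two flanks and in the repeat itself. It is not part of EH. The function name, the `min_ratio` threshold, and the input format (a list of per-base depths for one contig, obtained separately, for example from a depth-reporting tool) are all assumptions made for illustration.

```python
from statistics import mean


def summarize_region_coverage(depths, repeat_start, repeat_end,
                              flank=1000, min_ratio=0.5):
    """Compare mean depth across the left flank, the repeat, and the right flank.

    depths       -- per-base read depths for one contig (0-based list)
    repeat_start -- 0-based start of the repeat (inclusive)
    repeat_end   -- 0-based end of the repeat (exclusive)
    flank        -- flank size in bases (1 kb, as mentioned in the reply)
    min_ratio    -- hypothetical evenness threshold, not an EH parameter
    """
    segments = {
        "left_flank": depths[max(0, repeat_start - flank):repeat_start],
        "repeat": depths[repeat_start:repeat_end],
        "right_flank": depths[repeat_end:repeat_end + flank],
    }
    # An empty segment (region runs off the contig) counts as zero coverage.
    means = {name: (mean(seg) if seg else 0.0) for name, seg in segments.items()}

    highest = max(means.values())
    lowest = min(means.values())
    # "Even" here means the least-covered segment reaches at least
    # min_ratio of the best-covered one.
    is_even = highest > 0 and lowest / highest >= min_ratio
    return means, is_even


# WGS-like profile: uniform depth across the whole region.
wgs_means, wgs_even = summarize_region_coverage([30] * 3000, 1000, 1060)

# WES-like profile: reads pile up on a 200-base target, flanks are mostly empty.
wes_depths = [0] * 900 + [40] * 200 + [0] * 1900
wes_means, wes_even = summarize_region_coverage(wes_depths, 1000, 1060)
```

A locus flagged as uneven by a check like this is not necessarily uncallable, but it is a reasonable candidate for closer inspection before trusting the genotype.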
That said, I know of multiple projects that obtained useful results from WES data. This usually required extra benchmarking to delineate which repeats can be accurately called from WES.
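One possible way to approach such benchmarking, sketched here under the assumption that some samples have both WES and PCR-free WGS calls, is to compute per-locus concordance and keep only the loci where the two agree. The data layout (dictionaries keyed by sample and locus), the function name, and the `tolerance` parameter are hypothetical, and treating WGS as the reference is an assumption rather than something prescribed by EH.

```python
from collections import defaultdict


def per_locus_concordance(wgs_calls, wes_calls, tolerance=0):
    """Return the fraction of samples per locus where WES matches WGS.

    wgs_calls, wes_calls -- dicts mapping (sample, locus) to a tuple of
                            allele sizes in repeat units
    tolerance            -- allowed difference per allele, in repeat units
    """
    matched = defaultdict(int)
    total = defaultdict(int)

    for (sample, locus), wgs_genotype in wgs_calls.items():
        wes_genotype = wes_calls.get((sample, locus))
        if wes_genotype is None:
            continue  # locus not called in WES for this sample

        total[locus] += 1
        # Sort alleles so that (17, 19) and (19, 17) compare as equal.
        wgs_sorted = sorted(wgs_genotype)
        wes_sorted = sorted(wes_genotype)
        same_ploidy = len(wgs_sorted) == len(wes_sorted)
        if same_ploidy and all(abs(a - b) <= tolerance
                               for a, b in zip(wgs_sorted, wes_sorted)):
            matched[locus] += 1

    return {locus: matched[locus] / total[locus] for locus in total}


# Made-up genotypes for illustration only.
wgs = {("s1", "HTT"): (17, 19), ("s2", "HTT"): (15, 21), ("s1", "FMR1"): (30,)}
wes = {("s1", "HTT"): (19, 17), ("s2", "HTT"): (15, 18), ("s1", "FMR1"): (30,)}
concordance = per_locus_concordance(wgs, wes)
```

Loci with low concordance in a comparison like this would be the ones to exclude or treat with caution when calling from WES alone.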
Did I answer your question? Please let me know if you have any follow-up questions or comments.
Best wishes, Egor