ExpansionHunter
Details on repeat unit of interrupted and alternate alleles
The two most common issues we have right now pertain to annotating the repeat units: A) which precise repeat unit is actually present in a repeat that matches (for example at RFC1), and B) interrupted versus uninterrupted allele expansions, as in ATXN1.
The latter (B) is perhaps to some extent covered by repeat purity, and may be solved by exposing/using it, but automatically recovering the longest uninterrupted pure sub-stretch would be useful.
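To illustrate what "longest uninterrupted pure sub-stretch" could mean in practice, here is a minimal Python sketch. It is not how ExpansionHunter works internally; it assumes the allele sequence is already available as a plain string, and the function name `longest_pure_stretch` and the example sequence are made up for illustration.

```python
def longest_pure_stretch(sequence, motif):
    """Return (start, n_units) for the longest uninterrupted run of `motif`.

    Hypothetical helper for illustration only. Returns (-1, 0) if the
    motif does not occur in the sequence at all.
    """
    k = len(motif)
    best_start, best_units = -1, 0
    for i in range(len(sequence)):
        if not sequence.startswith(motif, i):
            continue
        # Only count from the first unit of a run: skip positions whose
        # preceding unit (k bases upstream) is also a motif copy.
        if i >= k and sequence.startswith(motif, i - k):
            continue
        j = i
        while sequence.startswith(motif, j):
            j += k
        units = (j - i) // k
        if units > best_units:
            best_start, best_units = i, units
    return best_start, best_units


# Made-up ATXN1-like allele: CAG repeats interrupted by CAT units.
allele = "CAG" * 12 + "CAT" + "CAG" + "CAT" + "CAG" * 15
start, units = longest_pure_stretch(allele, "CAG")
print(start, units)
```

For the made-up allele above, the total CAG count is 28, but the longest pure stretch is the 15-unit run after the second CAT, which is the number we would want reported.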
It would be helpful for screening to be able to see directly from the VCF whether the discovered RFC1 alleles were AAGGG or AAAAG (or one of the other, slightly rarer versions), together with the zygosity, to tell whether the expanded locus is homozygous normal or pathologic.
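As a rough sketch of the kind of per-allele motif call we have in mind (again not ExpansionHunter's method), the Python snippet below counts copies of each candidate motif in an allele sequence and reports the most frequent one. The function name `call_motif`, the candidate list, and the example sequences are assumptions for illustration; rotations of each motif are tried so that the result does not depend on where the read-out sequence happens to start.

```python
def call_motif(sequence, candidates=("AAAAG", "AAGGG")):
    """Return (best_motif, counts) for a repeat allele sequence.

    Hypothetical helper: for each candidate motif, count non-overlapping
    occurrences of every rotation of the motif and keep the highest count.
    `best_motif` is None when no candidate occurs at all.
    """
    counts = {}
    for motif in candidates:
        rotations = [motif[i:] + motif[:i] for i in range(len(motif))]
        # str.count counts non-overlapping occurrences, left to right.
        counts[motif] = max(sequence.count(r) for r in rotations)
    best = max(counts, key=counts.get)
    if counts[best] == 0:
        best = None
    return best, counts


# Made-up allele sequences with short flanks.
allele_1 = "TTT" + "AAAAG" * 8 + "CC"
allele_2 = "TTT" + "AAGGG" * 40 + "CC"
print(call_motif(allele_1)[0], call_motif(allele_2)[0])
```

With one such call per allele, the two calls together with the repeat sizes would be enough to flag, for example, a sample carrying the AAGGG motif on both alleles.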
Thanks for the suggestion! We will work on enabling EH to annotate motif changes. Do you have any samples that have such mutations?
Sorry, I missed the reply! Yes, we do have some for RFC1, but I kind of think you do as well, right? 😸 Off the top of my head, we also have some at least for interrupted ATXN1, but that does seem to be the normal case, so I'm sure you do as well. We'll keep a lookout for an uninterrupted one; just let me know if you would want slices for any of the others.
@dnil, this sounds good. Once we implement the initial version of the motif change annotation algorithm, would you be up for running it on some of your data to confirm that the results look accurate?