SO-Ontologies
What's so special about splice_donor_5th_base_variant?
Hi, does anyone happen to have a reference supporting the relative importance of donor splice variants 5 bases downstream of the exon? I saw a mention of the splice_donor_5th_base_variant annotation in #403, but I haven't yet found any literature that motivates the special attention. I would appreciate any guidance, thanks!
There are two splicing complexes known as U2/major and U12/minor. There's a diagram of the consensus sites for both on this page: https://en.wikipedia.org/wiki/Minor_spliceosome
For U2, the full donor consensus extends two bases into the exon and covers the first 6 bases of the intron: AG|GU[AG]AGU
The canonical GT donor (GU in RNA notation) is by far the strongest part of that motif, although U2 splicing can also use GC, AT, or rarely some other donor dinucleotides. The rest of the motif is weaker, but the G at position 5 is stronger than the other non-canonical positions. There's a sequence logo in figure 1 of this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4329672/
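To make the position numbering concrete, here is a minimal Python sketch that compares an 8-base donor site (2 exonic bases followed by 6 intronic bases) against the commonly cited U2 consensus, written in DNA as AG|GT[AG]AGT. The consensus table and the function name `donor_mismatches` are assumptions made for illustration; this is not how Ensembl VEP or the Sequence Ontology define the term, and it ignores the relative strength of each position.

```python
# Assumed DNA consensus for a U2 donor site: positions -2 and -1 are exonic,
# positions +1 to +6 are intronic. Each entry lists the bases accepted there.
DONOR_CONSENSUS = [
    (-2, "A"), (-1, "G"),
    (1, "G"), (2, "T"), (3, "AG"), (4, "A"), (5, "G"), (6, "T"),
]


def donor_mismatches(site):
    """Return the positions where an 8-base donor site departs from the
    assumed consensus. Positions are relative to the exon/intron boundary."""
    # Accept RNA or DNA input, in either case.
    site = site.upper().replace("U", "T")
    if len(site) != len(DONOR_CONSENSUS):
        raise ValueError("expected 8 bases: 2 exonic followed by 6 intronic")
    return [
        pos
        for (pos, allowed), base in zip(DONOR_CONSENSUS, site)
        if base not in allowed
    ]


# A G>A change at intron position +5 is the case that
# splice_donor_5th_base_variant is meant to flag.
print(donor_mismatches("AGGTAAAT"))
```

A site that matches the consensus returns an empty list, and a change in the canonical dinucleotide is reported at position 1 or 2.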
There's probably something newer published somewhere, but that looks like what I would expect. From that I would think an SO term for the -1 position would also be warranted, and some of the other positions to a lesser extent. And the U12 donor consensus is MUCH stronger across all 9 bases, but U12 is only ~1% of splicing and there may not be any known disease-causing U12-donor mutations to motivate special terms.
This is the work that prompted us to flag these in Ensembl VEP: https://genome.cshlp.org/content/29/2/159
I think the term's description should be updated to include some of the above.
Thanks for the reference, @sarahhunt. (And thank you @murphyte for the additional context.) It's helpful to know the motivation for the annotation, so I agree that putting the reference somewhere in the ontology description would be useful.