Support for Sequence Ontology terms from SnpEff

Open EvanDietzMorris opened this issue 3 years ago • 2 comments

Using SnpEff to predict sequence variant to gene relationships produces predicate terms from the Sequence Ontology (SO). Currently, Biolink mappings for Sequence Ontology terms are incomplete. The SnpEff output format is documented at https://pcingola.github.io/SnpEff/se_inputoutput/

Additionally, SnpEff returns Sequence Ontology predicate terms like "downstream_gene_variant" where numerical SO identifiers also exist (SO:0001632). These mappings could be done before biolink conversion but it might be nice to include them both if possible.
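As a rough illustration of doing this mapping before Biolink conversion, here is a minimal sketch that resolves SnpEff term names to SO CURIEs while keeping both forms. The table holds only identifiers mentioned in this issue, `so_curies` is a hypothetical helper name, and splitting on `&` assumes SnpEff's convention of joining multiple effects in one annotation.

```python
from typing import List, Optional, Tuple

# Hypothetical lookup table: only SO terms quoted in this issue.
SO_NAME_TO_CURIE = {
    "downstream_gene_variant": "SO:0001632",
    "upstream_gene_variant": "SO:0001631",
    "intron_variant": "SO:0001627",
    "non_coding_transcript_variant": "SO:0001619",
}


def so_curies(annotation: str) -> List[Tuple[str, Optional[str]]]:
    """Return (name, curie) pairs for each SO term in a SnpEff annotation.

    Assumes combined effects are joined with '&'. The curie is None for
    names missing from the table, so gaps stay visible to the caller.
    """
    return [
        (name, SO_NAME_TO_CURIE.get(name))
        for name in annotation.split("&")
        if name
    ]


print(so_curies("downstream_gene_variant"))
print(so_curies("intron_variant&some_unmapped_term"))
```

Returning the name alongside the CURIE keeps both available downstream, and unmapped names come back with `None` instead of being dropped.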

This brings up the question of how granular Biolink should be in supporting SO terms, where children and parent terms may both be used by SnpEff. For example, non_coding_transcript_variant (SO:0001619) has children non_coding_transcript_splice_region_variant, non_coding_transcript_exon_variant, non_coding_transcript_intron_variant and any of them could be used.
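One way to handle the granularity question without mapping every child term is to roll a term up to its nearest supported ancestor. The sketch below assumes a hand-written parent table covering only the terms named above; `SO_PARENT`, `SUPPORTED`, and `nearest_supported` are hypothetical names, and a real implementation would more likely read the hierarchy from the SO release files.

```python
from typing import Optional

# Hypothetical child -> parent table, limited to the example in this issue.
SO_PARENT = {
    "non_coding_transcript_splice_region_variant": "non_coding_transcript_variant",
    "non_coding_transcript_exon_variant": "non_coding_transcript_variant",
    "non_coding_transcript_intron_variant": "non_coding_transcript_variant",
}

# Terms assumed (for illustration) to have a direct Biolink mapping.
SUPPORTED = {"non_coding_transcript_variant"}


def nearest_supported(term: str) -> Optional[str]:
    """Walk up the parent table until a supported term is found.

    Returns None if no supported ancestor exists. The 'seen' set guards
    against accidental cycles in a hand-maintained table.
    """
    seen = set()
    current: Optional[str] = term
    while current is not None and current not in seen:
        if current in SUPPORTED:
            return current
        seen.add(current)
        current = SO_PARENT.get(current)
    return None


print(nearest_supported("non_coding_transcript_exon_variant"))
```

With this approach only the parent terms need explicit mappings, at the cost of losing the finer distinction unless the original SO term is kept alongside the edge.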

Currently:
is nearby variant of: GAMMA:0000102
is non coding variant of: GAMMA:0000103

Possible additions:
is nearby variant of: SO:0001632 (downstream_gene_variant), SO:0001631 (upstream_gene_variant)
is non coding variant of: SO:0001627 (intron_variant), SO:0001619 (non_coding_transcript_variant), SO:0001970 (non_coding_transcript_intron_variant), SO:0001792 (non_coding_transcript_exon_variant)
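To make the proposed additions concrete, here is a small sketch that stores them per predicate and inverts the table for lookup by SO CURIE. This is not how the Biolink model itself stores mappings (those live in the model's YAML); the `biolink:` predicate CURIE spellings and the names `PROPOSED_MAPPINGS` and `predicate_for` are assumptions for illustration.

```python
from typing import Dict, List, Optional

# Proposed predicate -> SO CURIEs, as listed in this issue.
# The predicate CURIE spellings are assumed, not taken from the model file.
PROPOSED_MAPPINGS: Dict[str, List[str]] = {
    "biolink:is_nearby_variant_of": [
        "SO:0001632",  # downstream_gene_variant
        "SO:0001631",  # upstream_gene_variant
    ],
    "biolink:is_non_coding_variant_of": [
        "SO:0001627",  # intron_variant
        "SO:0001619",  # non_coding_transcript_variant
        "SO:0001970",  # non_coding_transcript_intron_variant
        "SO:0001792",  # non_coding_transcript_exon_variant
    ],
}

# Invert to SO CURIE -> predicate for constant-time lookup.
SO_TO_PREDICATE: Dict[str, str] = {
    so: predicate
    for predicate, so_terms in PROPOSED_MAPPINGS.items()
    for so in so_terms
}


def predicate_for(so_curie: str) -> Optional[str]:
    """Return the proposed Biolink predicate for an SO CURIE, or None."""
    return SO_TO_PREDICATE.get(so_curie)


print(predicate_for("SO:0001632"))
print(predicate_for("SO:0001792"))
```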

These are some examples that are currently completely missing from the Biolink model:
conservative_inframe_deletion: SO:0001825
conservative_inframe_insertion: SO:0001823
disruptive_inframe_deletion: SO:0001826
disruptive_inframe_insertion: SO:0001824
3_prime_UTR_variant: SO:0001624 (and its children)
5_prime_UTR_variant: SO:0001623 (and its children)

This is coming from RENCI. Thanks.

EvanDietzMorris avatar Jun 01 '22 16:06 EvanDietzMorris

@EvanDietzMorris is this related to one of the Translator teams?

nlharris avatar Jul 27 '22 22:07 nlharris

That's correct. I am working with the Translator gamma team at UNC. The old "GAMMA:00" CURIE mappings could be removed and replaced.

EvanDietzMorris avatar Jul 27 '22 23:07 EvanDietzMorris