
OBI:0100026 as taxon for variant objects

Open deepakunni3 opened this issue 5 years ago • 10 comments

This bug was originally raised by @iimpulse

For reference, the query: https://api-dev.monarchinitiative.org/api/bioentity/gene/MGI:98297/variants?fetch_objects=true&start=0&rows=10&facet=true&facet_fields=subject_taxon&taxon=OBI%3A0100026

yields gene-to-variant associations. However, in some of these associations the object has a BNODE prefix and a taxon of OBI:0100026 'organism'. This makes it difficult to filter variants by taxon, since 'organism' is too generic a taxon term.
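As a rough illustration of the problem, here is a minimal Python sketch that rebuilds the query URL above and flags associations of the kind described. The record shape passed to `has_generic_taxon` (an `object` dict with `id` and `taxon.id` keys) is an assumption made for this example, not a confirmed description of the API response, and the sample IDs are made up.

```python
from urllib.parse import urlencode

BASE = "https://api-dev.monarchinitiative.org/api/bioentity/gene/MGI:98297/variants"
GENERIC_TAXON = "OBI:0100026"  # 'organism', too generic to filter on

# Same parameters as the query in this issue; urlencode escapes ':' as %3A.
params = {
    "fetch_objects": "true",
    "start": 0,
    "rows": 10,
    "facet": "true",
    "facet_fields": "subject_taxon",
    "taxon": GENERIC_TAXON,
}
url = f"{BASE}?{urlencode(params)}"


def has_generic_taxon(assoc):
    """Return True for a BNODE-prefixed object whose taxon is plain 'organism'.

    The dict layout used here is hypothetical and only serves the example.
    """
    obj = assoc.get("object") or {}
    taxon = obj.get("taxon") or {}
    return str(obj.get("id", "")).startswith("BNODE") and taxon.get("id") == GENERIC_TAXON


# Made-up sample records, purely illustrative.
sample = [
    {"object": {"id": "BNODE:example1", "taxon": {"id": "OBI:0100026"}}},
    {"object": {"id": "MGI:example2", "taxon": {"id": "NCBITaxon:10090"}}},
]
flagged = [a["object"]["id"] for a in sample if has_generic_taxon(a)]
```

A check like this only identifies the affected associations; it does not say which taxon they should carry instead.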

@cmungall @kshefchek Thoughts?

CC'ing @monicacecilia for her awesomeness!

deepakunni3 avatar Nov 06 '19 16:11 deepakunni3

We originally attempted to infer the taxon on genotype parts instead of making an explicit edge for each (in retrospect, maybe a mistake). There's an inference path in the solr loader that has been broken for some time, in the sense that it either doesn't infer the taxa or infers 'organism'.

tl;dr this should probably be fixed in dipper, and likely won't be fixed in the short term

kshefchek avatar Nov 06 '19 17:11 kshefchek

see also - https://github.com/SciGraph/golr-loader/issues/10

kshefchek avatar Nov 06 '19 17:11 kshefchek

That's good to know.

Thanks @kshefchek

So is this blocked by Dipper, by the SciGraph loader, or by both?

deepakunni3 avatar Nov 06 '19 18:11 deepakunni3

The way it works now, this could either be fixed in dipper or the golr-loader code. We could also add something in scigraph but it would be some new post processor. I think the best thing to do is to add it in dipper.

kshefchek avatar Nov 06 '19 18:11 kshefchek

Looking closer, many of these are transgenes, so which taxon applies? I would think the taxon in which the variant is studied, but that is not entirely accurate.

kshefchek avatar Nov 07 '19 23:11 kshefchek

@mbrush Are you able to join the monarch-ui call on Tuesday November 12?

This ticket is in relation to representation of variants, and we believe we need your help.

iimpulse avatar Nov 08 '19 17:11 iimpulse

Hi. Happy to join the call on Tuesday. In the meantime, Appendix I of this document provides food for thought that I think is relevant to this topic. It gets pretty far into the weeds concerning what it means to be a 'transgene' or an 'allele' from the GENO perspective, but the key bits are in the third paragraph (the one that starts with "An allele . . . "). Copying the key text below, but see the document for broader context.

An allele in GENO, including those caused by insertions, is an allele_of some reference genomic feature. This feature is typically a gene, but even insertions falling outside of genes are considered alleles_of the reference feature they alter (e.g. alleles of other named features such as QTLs). The feature or gene that an allele is an allele_of is entirely dependent on its genomic position, and not on the sequence content it contains. For example, insertion of the S. cerevisiae GAL4 gene sequence within the D. melanogaster Bx gene locus would create an allele_of this Bx gene, but the resulting transgene would not be considered an allele_of the S. cerevisiae GAL4 gene - because positionally it is not located in a yeast genome at the yeast GAL4 locus. Rather, GENO would say that this transgene derives_sequence_from the S. cerevisiae GAL4 gene.
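To make the distinction concrete, here is a small Python sketch of the GAL4/Bx example. It is not GENO's or Dipper's actual data model: the `Feature` and `Allele` classes and the `EX:` identifiers are hypothetical, and only the two NCBITaxon IDs (7227 for D. melanogaster, 4932 for S. cerevisiae) are real. It shows that a transgene can point at two different taxa depending on which relation is followed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Feature:
    """A reference genomic feature (typically a gene) and its taxon."""
    id: str
    label: str
    taxon: str


@dataclass(frozen=True)
class Allele:
    """An allele defined by genomic position, with optional foreign sequence."""
    id: str
    allele_of: Feature                                # positional relation
    derives_sequence_from: Optional[Feature] = None   # sequence-content relation

    def positional_taxon(self) -> str:
        # Taxon of the genome the allele is located in.
        return self.allele_of.taxon

    def sequence_source_taxon(self) -> Optional[str]:
        # Taxon the inserted sequence came from, if any.
        src = self.derives_sequence_from
        return src.taxon if src else None


# Placeholder identifiers (EX:...) are invented for this example.
bx = Feature("EX:Bx", "D. melanogaster Bx", "NCBITaxon:7227")
gal4 = Feature("EX:GAL4", "S. cerevisiae GAL4", "NCBITaxon:4932")
transgene = Allele("EX:Bx-GAL4-insertion", allele_of=bx, derives_sequence_from=gal4)

positional = transgene.positional_taxon()          # fly: where the allele sits
sequence_src = transgene.sequence_source_taxon()   # yeast: where the sequence is from
```

Under this reading, an explicit taxon on the variant would follow `allele_of` rather than `derives_sequence_from`, though whether that is the right choice for the API is exactly what this thread is discussing.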

mbrush avatar Nov 08 '19 20:11 mbrush

I am confused about why we are talking about an OBI ID in the first place. We shouldn't be using the OBI class for organism.

cmungall avatar Nov 12 '19 01:11 cmungall

@cmungall this comes from an issue that spans multiple integration steps: first from running ELK on GENO, then from attempting to infer the taxon via this graph path search: https://github.com/SciGraph/golr-loader/blob/master/src/main/java/org/monarch/golr/GolrLoader.java#L157

kshefchek avatar Nov 12 '19 06:11 kshefchek

@kshefchek send along the list of troublesome IDs when you get a chance, and I'll figure out which Dipper ingests need to be corrected here

justaddcoffee avatar Nov 18 '19 20:11 justaddcoffee