amigo
Some AspGD taxon data appears without a label
There is something amiss with the way AspDB annotations appear in AmiGO.
The taxon isn't parsed correctly, so these entries are not available in the organism filter. E.g.:
http://amigo.geneontology.org/amigo/gene_product/AspGD:Aspfo1_0204585
@ValWood That's a good catch--thank you. I'll look into this today.
Looking at similarly formed entries from the GAF, the problem is not uniform: http://amigo.geneontology.org/amigo/gene_product/AspGD:Aspka1_0181639
To note, the issue seems to be that in some cases the taxon ID is not resolved to a label, so the main taxon entry is left "blank" and the table shows the bare ID instead.
As this is a relatively new annotation done on the day of the release, I wonder if somehow the ncbi taxon ontology could have been out of sync with the annotations, leading to a case where the label went AWOL.
That theory is at least partially wrong: as of 2019-06-23, http://amigo-exp.geneontology.io/amigo/gene_product/AspGD:Aspfo1_0204585 still has the information gap.
It might be something to do with taxon strain IDs vs strain IDs (some species have strain IDs in NCBI). I'm not completely sure what these particular IDs are but it's a possibility.
@marekskrzypek might be able to enlighten you?
It isn't restricted to AspDB
http://amigo.geneontology.org/amigo/gene_product/CGD:CORT_0G01250
@ValWood It seems to be the same taxon though: NCBITaxon:1136231, which is a good thing. I do not think the problem resides in the GAF; it is more likely in the loader or in the NCBITaxon file that we load.
Noting from load log:
[2019-06-10T12:09:23.763Z] 2019-06-10 12:09:23,648 INFO (GafSolrDocumentLoader:189) Skipping taxon closures for unknown id: NCBITaxon:1136231
That's owltools, around
final OWLClass taxCls = graph.getOWLClassByIdentifier(taxonId);
within bioentity solr document assembly. That would seem to be an issue with the ontology, then. @balhoff Would you be able to officially confirm whether NCBITaxon:1136231 is present in "http://purl.obolibrary.org/obo/ncbitaxon/subsets/taxslim.owl"? Grepping shows that it is not there. If it is missing, what are the channels to add it?
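For anyone wanting to repeat the check across a whole load, here is a minimal sketch in Python, not part of the actual pipeline. It pulls the "unknown id" taxa out of a load log and tests whether each one occurs in the text of an ontology file. The regular expression is assumed from the single log line quoted above, the function names are made up for illustration, and the lookup is only a text search for the standard OBO IRI (a grep equivalent), not a real OWL parse.

```python
import re

# Pattern assumed from the one log line quoted in this thread.
UNKNOWN_TAXON = re.compile(
    r"Skipping taxon closures for unknown id: (NCBITaxon:\d+)"
)


def unknown_taxa(log_lines):
    """Return the sorted, de-duplicated taxon CURIEs reported as unknown."""
    found = set()
    for line in log_lines:
        match = UNKNOWN_TAXON.search(line)
        if match:
            found.add(match.group(1))
    return sorted(found)


def curie_to_iri(curie):
    """Expand e.g. NCBITaxon:1136231 to its OBO PURL form."""
    prefix, local_id = curie.split(":", 1)
    return f"http://purl.obolibrary.org/obo/{prefix}_{local_id}"


def is_mentioned(curie, owl_text):
    """Text-level check: does the IRI occur in the ontology file at all?

    The negative lookahead stops NCBITaxon_1136231 from matching inside
    a longer ID such as NCBITaxon_11362310.
    """
    iri = curie_to_iri(curie)
    return re.search(re.escape(iri) + r"(?!\d)", owl_text) is not None
```

Feeding `unknown_taxa` the lines of the load log and `is_mentioned` the contents of a downloaded taxslim.owl would list every taxon the loader skipped and show which of them are absent from the slim.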
@cmungall I believe that you originally made the taxslim ontology? What would be the procedure for getting something in there? Re: https://github.com/geneontology/amigo/issues/570#issuecomment-505678832
@cmungall what is the origin of taxslim? Should we just expand GO ncbitaxon_import as needed and extract with ROBOT? Could keep a seed file in addition to the taxa directly referenced in the ontology.
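If the seed-file route is taken, the seed list could be derived from the annotations themselves. A rough sketch, assuming standard GAF 2.x layout (taxon in column 13, written as `taxon:NNN`, with `|` separating a second taxon); the function names and the `extra_seeds` parameter are hypothetical, and the output is just one CURIE per line for whatever extraction step ends up consuming it.

```python
def gaf_taxa(gaf_lines):
    """Collect NCBITaxon CURIEs from column 13 of GAF annotation lines."""
    taxa = set()
    for line in gaf_lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 13:
            continue  # malformed row; ignore rather than guess
        # Column 13 may hold two taxa separated by "|".
        for entry in cols[12].split("|"):
            if entry.startswith("taxon:"):
                taxa.add("NCBITaxon:" + entry[len("taxon:"):])
    return taxa


def seed_lines(gaf_lines, extra_seeds=()):
    """Merge taxa seen in the GAF with a hand-kept seed list, sorted."""
    return sorted(gaf_taxa(gaf_lines) | set(extra_seeds))
```

Writing `"\n".join(seed_lines(...))` to a file would give a seed list that always covers the taxa actually used in the loaded annotations, plus anything kept by hand.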
there is something amiss with the way AspDB annotations appear in AmiGO.
Those were always like this. Only the ones coming from UniProt had the correct gene label.
Maybe this is not related but UniProt and AspDB and CGD weren't using the same tax id (although they were technically describing the same species).
Pascale
Does it need fixing upstream? Who do we tag?
@marekskrzypek