Single Exon node with the name `Exon`
I may have mentioned this before, but there is only a single node with the category `biolink:Exon`: a node with the name `Exon`. I think either the ETL for whichever KP supplies exon info is borked, or something else fishy is going on. If neither is the case, should this node (and the category) just be removed?
This is the single biolink:Exon node in KG2 (checked in RTX-KG2.9.0pre):
```json
{
  "iri": "http://www.ebi.ac.uk/efo/EFO_0004423",
  "synonym": [
    "exonic region"
  ],
  "category_label": "exon",
  "deprecated": "False",
  "name": "exon",
  "description": "An exon is a nucleic acid sequence that is represented in the mature form of an RNA molecule either after portions of a precursor RNA (introns) have been removed by cis-splicing or when two or more precursor RNA molecules have been ligated by trans-splicing.",
  "provided_by": "['infores:efo']",
  "id": "EFO:0004423",
  "category": "biolink:Exon",
  "update_date": "3630"
}
```
This node comes from EFO, which is ingested through the multi-ontology ("multi ont") load process. I would not be surprised if that ETL is "borked". I will take a look to see where this is coming from.
Here is the term in efo.owl:
```xml
<!-- http://www.ebi.ac.uk/efo/EFO_0004423 -->
<owl:Class rdf:about="http://www.ebi.ac.uk/efo/EFO_0004423">
    <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/BFO_0000040"/>
    <rdfs:subClassOf>
        <owl:Restriction>
            <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000050"/>
            <owl:someValuesFrom rdf:resource="http://www.ebi.ac.uk/efo/EFO_0004422"/>
        </owl:Restriction>
    </rdfs:subClassOf>
    <obo:IAO_0000115>An exon is a nucleic acid sequence that is represented in the mature form of an RNA molecule either after portions of a precursor RNA (introns) have been removed by cis-splicing or when two or more precursor RNA molecules have been ligated by trans-splicing.</obo:IAO_0000115>
    <oboInOwl:hasDbXref>NCIt:C13231</oboInOwl:hasDbXref>
    <oboInOwl:hasDbXref>SNOMEDCT:33091005</oboInOwl:hasDbXref>
    <oboInOwl:hasExactSynonym>exonic region</oboInOwl:hasExactSynonym>
    <rdfs:label>exon</rdfs:label>
</owl:Class>
```
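For anyone who wants to double-check how this class is wired up, here is a minimal Python sketch that pulls the named superclasses and the `owl:Restriction` out of the term. It is not part of the KG2 build code. The `<rdf:RDF>` wrapper and its namespace URIs are my assumption (the standard RDF/RDFS/OWL prefixes), the snippet is trimmed to the structural axioms, and the helper names `named_superclasses` and `existential_restrictions` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Standard RDF/RDFS/OWL namespace URIs (assumed; the efo.owl header is not shown above).
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

EXON_IRI = "http://www.ebi.ac.uk/efo/EFO_0004423"

# The exon term from efo.owl, trimmed and wrapped in an assumed rdf:RDF root.
OWL_SNIPPET = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://www.ebi.ac.uk/efo/EFO_0004423">
    <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/BFO_0000040"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000050"/>
        <owl:someValuesFrom rdf:resource="http://www.ebi.ac.uk/efo/EFO_0004422"/>
      </owl:Restriction>
    </rdfs:subClassOf>
    <rdfs:label>exon</rdfs:label>
  </owl:Class>
</rdf:RDF>"""


def _find_class(root, class_iri):
    """Return the owl:Class element whose rdf:about matches class_iri, or None."""
    for cls in root.findall("owl:Class", NS):
        if cls.get(RDF_ABOUT) == class_iri:
            return cls
    return None


def named_superclasses(owl_xml, class_iri):
    """IRIs of superclasses given directly via rdfs:subClassOf rdf:resource=..."""
    cls = _find_class(ET.fromstring(owl_xml), class_iri)
    if cls is None:
        return []
    return [
        sub.get(RDF_RESOURCE)
        for sub in cls.findall("rdfs:subClassOf", NS)
        if sub.get(RDF_RESOURCE) is not None
    ]


def existential_restrictions(owl_xml, class_iri):
    """(property IRI, filler IRI) pairs from someValuesFrom restrictions."""
    cls = _find_class(ET.fromstring(owl_xml), class_iri)
    pairs = []
    if cls is None:
        return pairs
    for restriction in cls.findall("rdfs:subClassOf/owl:Restriction", NS):
        prop = restriction.find("owl:onProperty", NS)
        filler = restriction.find("owl:someValuesFrom", NS)
        if prop is not None and filler is not None:
            pairs.append((prop.get(RDF_RESOURCE), filler.get(RDF_RESOURCE)))
    return pairs


if __name__ == "__main__":
    print(named_superclasses(OWL_SNIPPET, EXON_IRI))
    print(existential_restrictions(OWL_SNIPPET, EXON_IRI))
```

On this snippet the only named parent is BFO:0000040, and the only restriction is BFO:0000050 some EFO:0004422 (exome).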
EFO:0004423 is a subclass of material entity (BFO:0000040), as are several similar terms. The same issue appears to affect other subclasses of material entity, such as enzyme:
```json
{
  "iri": "http://purl.obolibrary.org/obo/OBI_0000427",
  "category_label": "protein",
  "deprecated": "False",
  "name": "enzyme",
  "description": "(protein or rna) or has_part (protein or rna) and has_function some GO:0003824 (catalytic activity); (protein or rna) or has_part (protein or rna) and has_function some GO:0003824 (catalytic activity)",
  "provided_by": "['infores:efo', 'infores:genepio']",
  "id": "OBI:0000427",
  "category": "biolink:Protein",
  "update_date": "2024-02-21 01:39:56 GMT"
}
```
These are all of the subclasses of material entity:
Running the following query on kg2endpoint-kg2-9-0.rtx.ai:

```cypher
match (n) where n.iri in [
  "http://purl.obolibrary.org/obo/BTO_0002690",
  "http://www.ebi.ac.uk/efo/EFO_0004446",
  "http://purl.obolibrary.org/obo/BTO_0000214",
  "http://www.ebi.ac.uk/efo/EFO_0000324",
  "http://purl.obolibrary.org/obo/GO_0005575",
  "http://www.ebi.ac.uk/efo/EFO_0006794",
  "http://purl.obolibrary.org/obo/CHEBI_24431",
  "http://www.ebi.ac.uk/efo/EFO_0005066",
  "http://www.ebi.ac.uk/efo/EFO_0000469",
  "http://purl.obolibrary.org/obo/OBI_0000427",
  "http://www.ebi.ac.uk/efo/EFO_0004422",
  "http://www.ebi.ac.uk/efo/EFO_0004423",
  "http://purl.obolibrary.org/obo/SO_0000704",
  "http://www.ebi.ac.uk/efo/EFO_0004420",
  "http://www.ebi.ac.uk/efo/EFO_0000548",
  "http://www.ebi.ac.uk/efo/EFO_0005060",
  "http://purl.obolibrary.org/obo/OBI_0100026",
  "http://www.ebi.ac.uk/efo/EFO_0000635",
  "http://purl.obolibrary.org/obo/OBI_0000245",
  "http://purl.obolibrary.org/obo/MPATH_0",
  "http://www.ebi.ac.uk/efo/EFO_0000663",
  "http://purl.obolibrary.org/obo/OBI_0000181",
  "http://www.ebi.ac.uk/efo/EFO_0010579",
  "http://purl.obolibrary.org/obo/OBI_0100051",
  "http://www.ebi.ac.uk/efo/EFO_0004359",
  "http://purl.obolibrary.org/obo/BTO_0001384"
]
return n.id, n.name, n.category, n.provided_by
```

returns:

| n.id | n.name | n.category | n.provided_by |
|---|---|---|---|
| "GO:0005575" | "cellular_component" | "biolink:CellularComponent" | "['infores:efo', 'infores:cl', 'infores:go-plus', 'infores:hpo', 'infores:mondo', 'infores:nbo', 'infores:pato', 'infores:pr', 'infores:uberon', 'infores:go']" |
| "CHEBI:24431" | "chemical entity" | "biolink:MolecularEntity" | "['infores:efo', 'infores:chebi', 'infores:cl', 'infores:disease-ontology', 'infores:foodon', 'infores:genepio', 'infores:go-plus', 'infores:hpo', 'infores:mondo', 'infores:nbo', 'infores:pato', 'infores:pr', 'infores:uberon']" |
| "OBI:0100026" | "organism" | "biolink:PhysicalEntity" | "['infores:efo', 'infores:foodon', 'infores:genepio', 'infores:go-plus', 'infores:pato', 'infores:pr', 'infores:ro']" |
| "SO:0000704" | "gene" | "biolink:Gene" | "['infores:efo', 'infores:disease-ontology', 'infores:go-plus', 'infores:mondo', 'infores:pr', 'infores:uberon']" |
| "OBI:0100051" | "specimen" | "biolink:PhysicalEntity" | "['infores:efo', 'infores:genepio']" |
| "EFO:0006794" | "cerebrospinal fluid biomarker measurement" | "biolink:InformationContentEntity" | "['infores:efo']" |
| "EFO:0000635" | "organism part" | "biolink:AnatomicalEntity" | "['infores:efo']" |
| "EFO:0000663" | "pool" | "biolink:PhysicalEntity" | "['infores:efo']" |
| "EFO:0005060" | "instrument part" | "biolink:PhysicalEntity" | "['infores:efo']" |
| "EFO:0005066" | "collection of material" | "biolink:MaterialSample" | "['infores:efo']" |
| "BTO:0000214" | "cell culture" | "biolink:PhysicalEntity" | "['infores:efo']" |
| "EFO:0004423" | "exon" | "biolink:Exon" | "['infores:efo']" |
| "EFO:0004422" | "exome" | "biolink:PhysicalEntity" | "['infores:efo']" |
| "EFO:0004420" | "genome" | "biolink:PhysicalEntity" | "['infores:efo']" |
| "EFO:0004446" | "biological macromolecule" | "biolink:MolecularEntity" | "['infores:efo']" |
| "EFO:0000324" | "cell type" | "biolink:Cell" | "['infores:efo']" |
| "EFO:0000548" | "instrument" | "biolink:PhysicalEntity" | "['infores:efo']" |
| "EFO:0000469" | "environmental factor" | "biolink:PhysicalEntity" | "['infores:efo']" |
| "EFO:0010579" | "proteome" | "biolink:PhysicalEntity" | "['infores:efo']" |
| "OBI:0000245" | "organization" | "biolink:PhysicalEntity" | "['infores:efo', 'infores:foodon', 'infores:genepio']" |
| "MPATH:0" | "pathological entity" | "biolink:BiologicalEntity" | "['infores:efo', 'infores:genepio', 'infores:hpo']" |
| "OBI:0000427" | "enzyme" | "biolink:Protein" | "['infores:efo', 'infores:genepio']" |
| "BTO:0001384" | "tissue culture" | "biolink:PhysicalEntity" | "['infores:efo']" |
| "EFO:0004359" | "telomere" | "biolink:PhysicalEntity" | "['infores:efo']" |
| "OBI:0000181" | "population" | "biolink:PhysicalEntity" | "['infores:efo', 'infores:genepio']" |
| "BTO:0002690" | "biofilm" | "biolink:PhysicalEntity" | "['infores:efo']" |
Many of these category assignments seem to be problematic.
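To get a quick overview of how the categories are distributed, here is a rough Python sketch that tallies the `n.category` column from the result table above. The `(id, category)` pairs are copied by hand from that table, and the helper names are made up for illustration. Note that "singleton" here means a category that appears once among these 26 nodes, not once in all of KG2.

```python
from collections import Counter

# (n.id, n.category) pairs copied from the query result above.
ROWS = [
    ("GO:0005575", "biolink:CellularComponent"),
    ("CHEBI:24431", "biolink:MolecularEntity"),
    ("OBI:0100026", "biolink:PhysicalEntity"),
    ("SO:0000704", "biolink:Gene"),
    ("OBI:0100051", "biolink:PhysicalEntity"),
    ("EFO:0006794", "biolink:InformationContentEntity"),
    ("EFO:0000635", "biolink:AnatomicalEntity"),
    ("EFO:0000663", "biolink:PhysicalEntity"),
    ("EFO:0005060", "biolink:PhysicalEntity"),
    ("EFO:0005066", "biolink:MaterialSample"),
    ("BTO:0000214", "biolink:PhysicalEntity"),
    ("EFO:0004423", "biolink:Exon"),
    ("EFO:0004422", "biolink:PhysicalEntity"),
    ("EFO:0004420", "biolink:PhysicalEntity"),
    ("EFO:0004446", "biolink:MolecularEntity"),
    ("EFO:0000324", "biolink:Cell"),
    ("EFO:0000548", "biolink:PhysicalEntity"),
    ("EFO:0000469", "biolink:PhysicalEntity"),
    ("EFO:0010579", "biolink:PhysicalEntity"),
    ("OBI:0000245", "biolink:PhysicalEntity"),
    ("MPATH:0", "biolink:BiologicalEntity"),
    ("OBI:0000427", "biolink:Protein"),
    ("BTO:0001384", "biolink:PhysicalEntity"),
    ("EFO:0004359", "biolink:PhysicalEntity"),
    ("OBI:0000181", "biolink:PhysicalEntity"),
    ("BTO:0002690", "biolink:PhysicalEntity"),
]


def category_counts(rows):
    """Count how many nodes in the result set carry each Biolink category."""
    return Counter(category for _node_id, category in rows)


def singleton_categories(counts):
    """Categories that occur exactly once in this result set, sorted by name."""
    return sorted(cat for cat, n in counts.items() if n == 1)


if __name__ == "__main__":
    counts = category_counts(ROWS)
    for category, n in counts.most_common():
        print(f"{n:2d}  {category}")
    print("occur once in this result set:", singleton_categories(counts))
```

The same tally could be done directly in Cypher against the endpoint; the Python version is only meant for eyeballing the pasted table.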