ontology-development-kit
Ideas needed: Workflows to keep edit files clear of foreign axioms
Terminology: native axioms are axioms belonging to an ontology; foreign axioms belong to another ontology and are imported. Typically native axioms are:
- all axioms which mention at least one entity from the ontology namespace
- axiom annotations on other native axioms
Foreign axioms are everything else. Example: in MP
- MP1 sub BFO1 is native
- BFO1 sub BFO2 is foreign
- BFO1 rdfs:label "hello" is foreign
Exceptions could be xrefs:
- BFO1 xref: "MP:1" is foreign
- BFO1 xref: MP1 should be foreign
No matter what I try, foreign axioms keep creeping into all the edit files I support, probably due to quirks in Protege, etc. These foreign axioms can confuse ROBOT validation as well, for example if you have a declaration of a GO class in your ontology but no label. More importantly, they may become outdated, refer to classes that were later obsoleted, etc., so we really don't want them. But how do we manage their removal in a super simple way?
Potential ideas:
- I was thinking of adding a command in ODK that simply prints foreign axioms using SPARQL/ROBOT. That way people can choose to remove them manually. The downside is that people who only ever look at the ontology in Protege (there are many!) may find it difficult to really clear out axioms such as declarations or, god forbid, GCIs.
- There could be an ODK command that cleans foreign axioms out of the edit file. The downside is that this would involve the ODK touching the edit file. I liked the fact that this never happened, because it protects the edit file from unintended changes due to some serialisation issue, but maybe that is overprotective?
Any other idea? @cmungall @dosumis @balhoff
Yeah, @beckyjackson and I need something like this soonish. The immediate use case is to convert a released OWL file to something like a base artifact.
I could see this in ROBOT as `remove/filter --axioms foreign` or something. My first thought is to get the signature of each axiom as a set of IRI strings, and if none of those IRIs are in the current ontology's namespace (by comparing string prefixes) then this axiom is foreign. (Note that ROBOT isn't yet aware of "the current ontology's namespace".)
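The prefix check described above can be sketched in a few lines of Python. This is only an illustration, not ROBOT code: the function name `is_foreign`, the plain-string representation of a signature, and the example IRIs are all assumptions made for the sketch.

```python
from typing import Iterable


def is_foreign(signature: Iterable[str], namespaces: Iterable[str]) -> bool:
    """Return True if no IRI in the axiom's signature starts with any
    of the given namespace prefixes (hypothetical helper, not a ROBOT API)."""
    prefixes = tuple(namespaces)
    # str.startswith accepts a tuple and matches if any prefix matches.
    return not any(iri.startswith(prefixes) for iri in signature)


# Assumed namespace prefix for the MP example used earlier in the thread.
MP_NS = ["http://purl.obolibrary.org/obo/MP_"]
OBO = "http://purl.obolibrary.org/obo/"

# "MP1 sub BFO1": the signature mentions an MP term, so it is native.
mp_sub_bfo = {OBO + "MP_0000001", OBO + "BFO_0000001"}
# "BFO1 sub BFO2": no MP term in the signature, so it is foreign.
bfo_sub_bfo = {OBO + "BFO_0000001", OBO + "BFO_0000002"}

print(is_foreign(mp_sub_bfo, MP_NS))
print(is_foreign(bfo_sub_bfo, MP_NS))
```

Note that a pure signature check does not cover the xref cases listed at the top of the thread, where the MP identifier appears only inside a string literal.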
So what we currently do is this:
- We allow the users to specify "namespace of interest"; see CL for an example.
- Then we use this to create a SPARQL query that selects all terms conforming to that spec
- We then use `filter --term-file $(SIMPLESEED) --select "annotations ontology anonymous self" --trim true --signature true` (example) to obtain only the "native" axioms.
What do you think of that in general?
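As a rough illustration of the second step, here is one way a seed query could be generated from a list of namespaces of interest. This is a sketch under assumptions, not the query the ODK actually generates: the function name `build_seed_query`, the query shape, and the CL prefix are all hypothetical.

```python
from typing import Sequence


def build_seed_query(namespaces: Sequence[str]) -> str:
    """Build a SPARQL SELECT query returning every IRI that starts with
    one of the given namespace prefixes (hypothetical helper)."""
    if not namespaces:
        raise ValueError("at least one namespace of interest is required")
    # One STRSTARTS test per namespace, combined with logical OR.
    tests = " || ".join(
        f'STRSTARTS(STR(?term), "{ns}")' for ns in namespaces
    )
    return (
        "SELECT DISTINCT ?term WHERE {\n"
        "  { ?term ?p ?o } UNION { ?s ?p ?term }\n"
        f"  FILTER(isIRI(?term) && ({tests}))\n"
        "}"
    )


# Assumed namespace prefix for CL terms.
query = build_seed_query(["http://purl.obolibrary.org/obo/CL_"])
print(query)
```

The resulting term list would then play the role of the seed file passed to the filter step via `--term-file`.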
@matentzn @beckyjackson what do you think about `external` rather than `foreign`?
I just opened a WIP PR here: https://github.com/ontodev/robot/pull/570
@balhoff the current WIP uses `foreign` but I think `external` is a good alternative. What do others think?
Happy with external/internal lingo!
I prefer external/internal
I think we need to be clearer about the definitions. What about equivalence axioms between two NCs?
@matentzn can you talk more about the circumstances in which you see these creeping into the edit file? I have seen this with declarations, but not with non-declaration axioms
The most important problem is that our current SOP requires the curator to create a new term as a subclass of Thing. For example, let's say I want to add a logical definition involving GO:001, and GO:001 is not currently present in the GO module I am importing. The current SOP says: add GO:001 (the IRI) under Thing, and then use it in the logical definition. This creates two foreign axioms, the declaration and the subclass-of-owl:Thing axiom, which need to be cleaned up periodically. I guess this problem will go away once everything is managed through DOSDP, but that is not going to be tomorrow, and it will only solve the problem for a handful of pattern-based phenotype ontologies.
I guess the problem could be solved merely through patience and training, plus a pipeline that fails if such axioms are introduced accidentally (Protege bugs, or people accidentally drag-dropping a class in the external part of the hierarchy). I don't mind. This is not a very high priority for me, and I am OK with being overruled on the ROBOT ticket as well. I just wanted to outline what my priorities are with regard to this functionality: I would like one quick method to divide the ontology into an external and an internal part, given a set of "internal namespaces". Around that, I can then easily design my QC framework.
(Remember, many curators won't run the release pipeline to refresh imports, but they often need a way to use external terms right away, hence the current practice of adding a declaration in Protege.)
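To make the requested split and the "failing pipeline" idea concrete, here is a minimal sketch. It assumes axioms are available as (description, signature) pairs, which is a simplification for illustration; the names `partition_axioms` and `qc_exit_code`, the MP namespace prefix, and the IRI forms are hypothetical.

```python
import sys
from typing import Iterable, List, Set, Tuple

Axiom = Tuple[str, Set[str]]  # (human-readable description, signature IRIs)


def partition_axioms(
    axioms: Iterable[Axiom], internal_namespaces: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Split axioms into (internal, external) by namespace prefix."""
    prefixes = tuple(internal_namespaces)
    internal: List[str] = []
    external: List[str] = []
    for description, signature in axioms:
        is_internal = any(iri.startswith(prefixes) for iri in signature)
        (internal if is_internal else external).append(description)
    return internal, external


def qc_exit_code(axioms: Iterable[Axiom], internal_namespaces: Iterable[str]) -> int:
    """Report external axioms on stderr; return 1 if any exist, else 0."""
    _, external = partition_axioms(axioms, internal_namespaces)
    for description in external:
        print(f"external axiom in edit file: {description}", file=sys.stderr)
    return 1 if external else 0


OBO = "http://purl.obolibrary.org/obo/"
OWL_THING = "http://www.w3.org/2002/07/owl#Thing"

# The SOP scenario above: GO:001 is declared and placed under owl:Thing.
edit_file = [
    ("Declaration(GO:001)", {OBO + "GO_001"}),
    ("GO:001 SubClassOf owl:Thing", {OBO + "GO_001", OWL_THING}),
    ("MP:1 SubClassOf BFO:1", {OBO + "MP_1", OBO + "BFO_1"}),
]

internal, external = partition_axioms(edit_file, [OBO + "MP_"])
print(internal)
print(external)
```

A QC step could call `sys.exit(qc_exit_code(edit_file, [OBO + "MP_"]))` so that the build fails whenever the external list is non-empty.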