OBOFoundry.github.io

New metadata tag: "defective" (ontology status)

Open matentzn opened this issue 2 years ago • 16 comments

Ok the title of this issue is a bit catchy, the metadata element I suggest here is "invalid", but I wanted you to open this issue and read it.

This is the gist:

  1. GAZ as a whole is broken, but used
  2. A part of GAZ is actively maintained, the rest is not (currently)
  3. Fixing GAZ is too costly, but a number of OBO Operations people don't want it to disappear from the OBO Foundry pages

Currently, GAZ is hidden by having it flagged as "inactive".

A number of OBO Operations committee members have favoured setting GAZ back to active, to reflect the fact that "a part of it is actively maintained". This led to a discussion that essentially concluded that "active" status should not be coupled with any specific QC rules, which pushes against #2120. I decided not to put this to a vote for now, because of the social capital the vote would cost.

I hereby propose a new, optional metadata element:

is_valid_ontology: TRUE/FALSE

With the following definition:

A valid ontology is (1) parseable and (2) non-empty (at least one axiom or annotation assertion) and (3) logically consistent (not necessarily coherent).

The flag can be set by OBO Foundry without any consent from the ontology maintainers if the file has been demonstrably broken for more than 3 months. This ontology status augments the current status values of "obsolete, active, inactive, and orphaned" with a technical dimension. The moment the ontology is demonstrably non-broken, OBO Foundry must change the flag to TRUE, or remove the metadata element.
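The rule above could be sketched roughly as follows. This is only an illustration: the `CheckResult` record, its field names, and the 90-day approximation of "3 months" are assumptions made for the example, not part of the proposal or of any existing OBO tooling.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Assumption: "more than 3 months" is approximated here as more than 90 days.
GRACE_PERIOD = timedelta(days=90)


@dataclass
class CheckResult:
    """Hypothetical summary of the three checks in the proposed definition."""
    parseable: bool
    axiom_count: int                     # axioms plus annotation assertions
    consistent: bool                     # consistent, not necessarily coherent
    broken_since: Optional[date] = None  # first date the ontology was seen broken


def is_valid(check: CheckResult) -> bool:
    # Valid = parseable AND non-empty AND logically consistent.
    return check.parseable and check.axiom_count >= 1 and check.consistent


def flag_value(check: CheckResult, today: date) -> Optional[bool]:
    """Return True/False for is_valid_ontology, or None if no change is due yet."""
    if is_valid(check):
        return True  # the element could equally be removed, since it is optional
    if check.broken_since is not None and today - check.broken_since > GRACE_PERIOD:
        return False
    return None  # broken, but not yet demonstrably so for more than 3 months
```

For example, an ontology first seen broken on 2023-01-01 would yield `False` when evaluated on 2023-04-28, and `None` when evaluated on 2023-02-01.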

Consequences of is_valid_ontology: FALSE:

  1. invalid ontologies are excluded from the OBO Dashboard to avoid computation and processing overhead
  2. invalid ontologies get a suitable warning on the OBO Foundry ontology table
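A minimal sketch of how the two consequences might be applied to registry metadata. The record layout, the placeholder ontology ids, and the function names are assumptions for illustration; they do not describe the actual OBO Dashboard or website code.

```python
from typing import Dict, List

# Hypothetical registry records; ids are placeholders, not real ontologies.
registry: List[Dict] = [
    {"id": "onto_a", "is_valid_ontology": False},
    {"id": "onto_b"},                              # element absent: treated as valid
    {"id": "onto_c", "is_valid_ontology": True},
]


def dashboard_targets(records: List[Dict]) -> List[str]:
    """Consequence 1: skip ontologies flagged as invalid."""
    return [r["id"] for r in records if r.get("is_valid_ontology", True)]


def table_warning(record: Dict) -> str:
    """Consequence 2: warning text for the ontology table (empty if none)."""
    if record.get("is_valid_ontology", True):
        return ""
    return f"Warning: {record['id']} is flagged as an invalid ontology"


print(dashboard_targets(registry))
print(table_warning(registry[0]))
```

Because the element is optional, the sketch treats a missing `is_valid_ontology` key the same as TRUE.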

matentzn avatar Apr 28 '23 13:04 matentzn

We need this!

Would this include ontologies like micro that have invalid RDF/XML (despite being parsed by the OWL API)?

cmungall avatar Apr 29 '23 11:04 cmungall

Suggestion on call: change is_valid_ontology --> defective

cc @balhoff

matentzn avatar May 02 '23 16:05 matentzn

@matentzn can you give a quick description of what 'logically consistent' means here, and how it would be assessed? If via the dashboard, then that moves a bit away from the original intent of the new status flag (that it cannot be assessed). Based on past discussion I'd say 'has parseable content' would be the label that best captures the intent.

nataled avatar May 02 '23 16:05 nataled

Reasoner: if the reasoner says "inconsistent", then it's logically so broken that you can't do anything with it!

"has parseable content" is not enough, because you can be parseable and not consistent.

matentzn avatar May 02 '23 16:05 matentzn

@matentzn got it. There are several reasons why I'm struggling with the label (and what lies beneath):

  1. As proposed, this flag will be quite different compared to the others. All others give relatively judgement-free indications about the ontology. That is, there's no aspect of content assessment implied in the status label. That's why I suggested 'has parseable content' (or, more simply, 'is assessable'); it's a status that doesn't reflect the informational 'goodness' of the ontology.

  2. Having the one status that captures two very different things doesn't properly inform the user of the issue. This is why I mentioned the possibility of separating into two flags, one for parsing and one for reasoning.

  3. I believe that 'logical consistency' should be its own principle, and thus be reflected on the dashboard in a more obvious way. I would prefer keeping all metadata and content assessments under the purview of the dashboard. I do realize going that route will take much longer to implement, so I'm sure there would be resistance to that idea.

  4. I could argue that "can't do anything with" a logically inconsistent ontology is a bit strong. Indeed, it's likely that many terms are still fine and can be imported judiciously. This is why I hesitate to use words like 'invalid' or 'broken' or 'defective'--they imply that the whole ontology is non-usable. Not to mention that people have their own ideas about what these words mean. Indeed, some ontologies are both assessable and logically consistent, but nonetheless have been described as defective (or a synonym thereof).

We started on this idea over concern about overloading the meaning of one status. My concern is that this proposal adds a different overloaded status (at least by name).

All that being said, I favor the use of terminology for these statuses that is 'neutral'. If I were to separate out the ideas and make them as neutral as possible, I'd go with something like 'works with dashboard' (or 'amenable to parsing') and 'works with reasoner' (or 'amenable to reasoning'). Perhaps a good combined status would be 'amenable to parsing & reasoning' (though that seems rather long).

nataled avatar May 02 '23 18:05 nataled

OK. @nataled while I have a mild tendency to not overload the set of metadata elements too much, I am not strongly opposed to your suggestion of separating the two status flags:

is_parseable: The ontology can be parsed using ROBOT, and contains at least 1 axiom. Note that some ontologies are partially parseable, and others only appear to be parseable, but really are not (for example, this ontology is parseable: <Ontology></Ontology>). The flag is strictly to separate entirely unparseable ontologies.
is_inconsistent: The ontology is logically inconsistent according to a DL reasoner like HermiT or ELK. Being inconsistent means that the ontology is not amenable to automated reasoning, a (the!) key promise of ontologies compared to simple vocabularies. To be interoperable, an ontology must be consistent: ontologies that import an inconsistent ontology become inconsistent themselves.

Let's see what other people think.

> I believe that 'logical consistency' should be its own principle

I agree. A principle like this would be very welcome, and is key for our GUOBO vision (the Grand Unified OBO ontology). I would love to contribute to a description of such a principle.

> it's likely that many terms are still fine and can be imported judiciously

This is only true if you use MIREOT imports, and treat the ontology as a vocabulary rather than an ontology. If you use semantics-aware extraction methods like SLME, you cannot guarantee to be able to extract a module from an inconsistent ontology!

matentzn avatar May 03 '23 10:05 matentzn

One ontology can be processed by one reasoner (ELK) and not the other (HermiT). Should we specify ELK as the "bottom line" reasoner for logical consistency? In addition, what about unsatisfiable classes? Should they also be flagged as inconsistent, as they could wreak havoc in ontologies that import them?

pfabry avatar May 09 '23 19:05 pfabry

> One ontology can be processed by one reasoner (ELK) and not the other (HermiT). Should we specify ELK as the "bottom line" reasoner for logical consistency?

I think we should focus on a conceptual definition of "inconsistent" here - there is a clear logical notion of inconsistency. Some reasoners will be incomplete, of course. The question of which reasoner to use should be an afterthought we debate on the OBO Dashboard tracker in my opinion.

> In addition, what about unsatisfiable classes? Should they also be flagged as inconsistent, as they could wreak havoc in ontologies that import them?

I would love that, but I see this as a second step. I don't think we should include this in the primary ontology metadata / but definitely in the OBO Dashboard.

matentzn avatar May 10 '23 10:05 matentzn

> I think we should focus on a conceptual definition of "inconsistent" here - there is a clear logical notion of inconsistency. Some reasoners will be incomplete, of course. The question of which reasoner to use should be an afterthought we debate on the OBO Dashboard tracker in my opinion.

This is not a conceptual definition, but maybe a useful proxy would be the validate-profile ROBOT command: An ontology is logically inconsistent if it produces a DL profile violation error.

EDIT: Unless this is already done within the ROBOT report?

pfabry avatar May 10 '23 14:05 pfabry

It's not really the same thing though: it only tells you whether the ontology is syntactically valid OWL, while what we are after is a semantic definition! What are your concerns with adopting the description logic notion of logical consistency?

matentzn avatar May 10 '23 15:05 matentzn

My understanding of logical consistency is that for an ontology to be logically consistent, all its axioms should be true. From a practical point of view, does that mean that no classes may be inferred under owl:Nothing? If so, I have no concern.

pfabry avatar May 10 '23 20:05 pfabry

The precise definition of inconsistency is that owl:Thing is a subclass of owl:Nothing. This furthermore implies everything, which means that all axioms become true. What you are talking about we usually refer to as coherence. An incoherent ontology is a consistent ontology that has one or more unsatisfiable classes.
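To make the distinction concrete, here is a small sketch that maps reasoner results onto the three terms used above. The function name and its inputs are hypothetical; it assumes some reasoner has already reported whether the ontology is consistent and which named classes are unsatisfiable.

```python
from typing import Sequence


def classify(consistent: bool, unsatisfiable_classes: Sequence[str] = ()) -> str:
    """Label an ontology as inconsistent, incoherent, or coherent."""
    if not consistent:
        # owl:Thing is a subclass of owl:Nothing: everything follows, so
        # listing individual unsatisfiable classes is no longer meaningful.
        return "inconsistent"
    if len(unsatisfiable_classes) > 0:
        # Consistent, but at least one class is equivalent to owl:Nothing.
        return "incoherent"
    return "coherent"


print(classify(True, ["ex:ClassA"]))  # a consistent ontology with one unsatisfiable class
```

Under the proposed definition of a valid ontology, only the "inconsistent" case would count against validity; "incoherent" would be left to the OBO Dashboard.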

matentzn avatar May 10 '23 20:05 matentzn

Thank you for the precision. In my opinion, it would be worthwhile to have the definitions of inconsistency, incoherence, unsatisfiability, validity, etc. written somewhere (at least in the glossary of OBO Academy) so that we can all refer to the same meaning of the terms.

pfabry avatar May 11 '23 14:05 pfabry

From OFOC meeting 2023-07-11:

Nico: rather than coming up with new tags, maybe figure out how to use dashboard results to drive information in the OBO website (e.g., incoherent in ELK).

Darren: Tags that imply a problem (such as this proposed 'defective' tag) could be displayed on the website (as is currently done for the status tags) but have a link to either the dashboard (if appropriate) or to some automatically-generated text based on dashboard results (could even be a mouse-over...?).

nataled avatar Jul 11 '23 17:07 nataled

What is the status of this? Too big a can of worms to resolve easily?

nlharris avatar Apr 29 '24 05:04 nlharris

Low ROI

matentzn avatar Apr 29 '24 07:04 matentzn