OBOFoundry.github.io

Review criteria for admitting an ontology to the OBOF

Open cstoeckert opened this issue 2 years ago • 6 comments

This issue is to provide a draft stating criteria for ontologies to meet in order to be added to the OBO Foundry (OBOF). The topic has been discussed many times during operations committee calls and a request to start documenting the criteria was made at the April 19, 2022 call.

Historical context: The OBOF has “reviewed” and library ontologies. The former are typically established reference ontologies and involved manual review for meeting OBOF principles. Library ontologies were admitted if committed to following OBOF principles and were not obviously in conflict with existing OBOF ontologies. The reviews and decisions to admit were done on a case by case basis. There have been cases of not admitting ontologies that were simply repeats of existing resources or not scientifically based. In the past couple years there has been a movement away from this approach motivated by a desire to get away from the reviewed versus library status and the introduction of the Dashboard to programmatically check meeting principles.

Current review practice: Progress has been made towards standardization of requirements to become part of the OBOF. An issue needs to be created at https://github.com/OBOFoundry/OBOFoundry.github.io/issues and a detailed template (https://github.com/OBOFoundry/OBOFoundry.github.io/issues/new?assignees=&labels=new+ontology&template=new-ontology.yml&title=Request+for+new+ontology+%5BNAME%5D) needs to be satisfactorily filled out. The template includes a pre-registration checklist that essentially requires the submitter to agree to OBOF principles to check all the boxes. The process includes passing the provisional Dashboard (https://obofoundry.org/obo-nor.github.io/dashboard/index.html). It is possible to still have an ontology submitter agree to everything and the ontology pass the Dashboard but the ontology not actually be logically consistent, scientifically accurate, or follow the principles as intended. A review of some kind is made as part of discussion for admittance by the OBOF operations committee.

OBOF operations committee discussion points to be applied in future reviews:

  1. We don’t want to admit ontologies whose use of imported terms or creation of new ones are problematic (e.g., inappropriately place imported classes, create/apply object properties that don’t make sense). Such logical inconsistencies will prevent interoperability with other OBOF ontologies and cause confusion if others try to use it.

  2. We don’t want to admit ontologies that claim to cover a domain but whose contributions are problematic (e.g., don’t really cover the area claimed, make assertions that are inaccurate). Such ontologies will be scientifically untrustworthy for others to use and will make it hard for other ontologies to provide better coverage of the domain.

  3. We don’t expect new ontologies to be perfect but we do expect them to be responsive to obvious or widespread problems. If the submitter makes good faith efforts to respond to identified problems then the ontology should be admitted.

A report from a reviewer could use these points to indicate whether there are widespread or glaring logical problems and/or serious coverage and accuracy problems. If there are such problems, then the submitter would need to demonstrate good faith effort through visible changes to the ontology before admittance.

cstoeckert, Apr 22 '22 16:04

Can we rephrase this to be more concise and actionable?

I suggest leaving out phrases like "Progress has been made towards standardization of requirements to become part of the OBOF" that are implicit

> It is possible to still have an ontology submitter agree to everything and the ontology pass the Dashboard but the ontology not actually be logically consistent, scientifically accurate, or follow the principles as intended

Surely a logically inconsistent ontology will always be detected by the dashboard?

"We don’t want to admit ontologies" -- what does this mean? Consider ISO language SHOULD/MUST/etc

> use of imported terms or creation of new ones are problematic (e.g., inappropriately place imported classes, create/apply object properties that don’t make sense).

"problematic" is very subjective. We have specific tickets for the examples given (inappropriately place imported classes / axiom injection), we should fast track these into concrete guidance

What does it mean for an object property to not make sense?

> Such logical inconsistencies will prevent interoperability with other OBOF ontologies and cause confusion if others try to use it.

The examples given are not logical inconsistencies - this term has a very precise meaning for OWL and ontologies and we shouldn't confuse our users by mixing these.

For this and the other criteria there should be clear examples and counter-examples

cmungall, Apr 22 '22 17:04

I'm also skeptical of any criteria that can't be evaluated programmatically. There's only so much effort that people are willing to put into doing and then writing up a subjective review of an ontology, which is again subject to their own experience. Additionally, we should be thinking about better ways of pushing the burden of ontology evaluation onto the submitters that can be evaluated in a structured way. For example, in https://github.com/OBOFoundry/OBOFoundry.github.io/issues/1819#issuecomment-1084581942, the ontology submitter probably didn't think very much about whether there was a more appropriate ontology for their efforts, but there's no record of this either way.

cthoyt, Apr 25 '22 10:04

Some text and thoughts for consideration. I tried to toss in some of the points that have stuck with me over the past couple of years, but treat these as notes rather than firm positions. I think we should soon move the draft text to a GDoc or similar to make the review more fluid/easier to handle, porting it back here when it starts to stabilise.

I'm pre-supposing that we will have some review criteria and we won't be blindly inclusive.

> For this and the other criteria there should be clear examples and counter-examples

We can add examples and counter-examples when the criteria / guidance is settled unless we need them sooner to clarify things internally.

> Surely a logically inconsistent ontology will always be detected by the dashboard?

The dashboard wasn't calling any reasoning errors for DISDRIV https://github.com/OBOFoundry/OBOFoundry.github.io/issues/1508, as no axioms were there for the inconsistencies to be detected.

> "problematic" is very subjective. We have specific tickets for the examples given (inappropriately place imported classes / axiom injection), we should fast track these into concrete guidance

A list of these would be helpful. I agree that these should be translated from experience into guidance, but we should all help compile the list.


Reviewing ontologies for OBO membership

Background

The OBO Community has, historically, distinguished between "Library" and the more rigorously reviewed "Foundry" ontologies.

The OBO Library served as a collection of experimental, fledgling, highly specialized (i.e. designed for use by a very specific community or in a specific project), or similar resources which were of interest to the community, but not designed or usable as generic, reference ontologies for the wider community. Ontologies that were admitted into the Library still, however, attempted to align to the OBO Principles, especially in avoiding thematic/content overlap with existing ontologies in both the OBO Library and Foundry (in favor of reuse).

Foundry ontologies were those that had been manually reviewed by (typically senior) members of the OBO Community for more strict compliance to the OBO Principles and suitability as generic references usable across many projects and/or communities. More emphasis was placed on reusing content from other OBO ontologies and accommodating the requests of their user base.

The distinction between Library and Foundry ontologies was and is multifaceted, contributing to the difficulty of consistently maintaining this distinction. Thus, reviews and the decisions to admit an ontology into the Library or Foundry were done on a case by case basis, often in ways and/or with reasoning that were difficult to accurately document or communicate to the OBO Community at large. To improve consistency and clarity, the OBO Operations Committee has been discussing how to both document and refactor its review processes to be more transparent, reproducible, inclusive, and accurate.

In this document, we will develop a consensus on both the categorisation of ontologies in the OBO Community and the standard operating procedure we will use to evaluate proposed additions to the OBO space.

Types of ontological artifacts in OBO

NB: "Ontological artifact" is used as we may and do have resources in OBO that are not ontologies in the strict sense (often for good reason), such as the NCBI Taxonomy and the NCI Thesaurus.

The typology of ontological artifacts noted below is an "unpacking" of the key elements that defined resources in the "Foundry" and "Library" categories. One or more of the types listed below should be applied to new and existing artifacts (e.g. as tags) to make their scope and intent clear. Reviews can then be more focused on evaluating what the artifact claims to be capable of and intended for, and users more aware of the nature of the resource they are using.

Artifact by semantic expressivity (assuming all OBO resources should follow the genus-differentia model: subclasses always inherit and never lose the attributes of their superclasses. This is key to ensuring cross-resource interoperability and importability).

  • Glossaries built on ontological principles: An ontology-like artifact where textual definitions are informal and/or deliberate and strong simplifications of technical terminology, but which still follow genus-differentia rules.
  • Structured vocabularies built on ontological principles: An ontology-like artifact which prioritizes the arrangement of terms themselves - with minimal or terse definitions - following genus-differentia rules.
  • Ontology: An artifact which uses formalised approaches to represent knowledge through rigorous and logically consistent textual and logical (axiomatic) definitions of terms, linked together with defined relations.

[Add thesauri/taxonomies built on ontological principles?]

Artifact by mission

  • Community artifact: An artifact which is driven by a defined community of users and developers.
  • Domain expert-led artifact: An artifact which has content created, curated, or otherwise sourced from and validated by experts in the domain the ontology targets.
  • Reference artifact: An artifact which has been designed to (and is maintained to) serve as a generic reference for a domain, wherein the developers/maintainers attempt to reconcile more narrow usages of terminology (e.g. due to regional, disciplinary, or linguistic conventions) to provide globally robust semantics.
  • Project-centric / bespoke artifact: An artifact which has been developed to support the needs of a single project or application over any other. Terms, definitions, and other properties of these artifacts will prioritize the views and/or needs of the project they are embedded in.
  • Experimental artifact: An artifact which is of interest to the OBO Community, but which has one or more experimental or unstable components that users must be aware of before (re)use.

[ADD AS NEEDED]

Guiding questions for review

  1. Was the artifact developed using expert input or trusted scientific sources representative of the consensus in its target domain of knowledge?
     • If the artifact was developed for a very specific purpose or community, representation and consensus need not be broad; however, this scope should be clearly stated and preserved during the lifetime of this artifact unless a re-review is called.
  2. Is the artifact available on the web in a valid serialization (e.g. OWL in XML, RDF, or functional syntax) that can be opened and inspected by open-source tools (e.g. Protégé)?
  3. Is the artifact's metadata informative and sufficient to understand its scope, contributors, and [TODO: add minimal criteria]
  4. Does the artifact pass the automated checks of the OBO Dashboard?
  5. If the automated checks are passed, does manual inspection of the artifact validate these passing marks? (This is especially relevant for principles that are hard to exhaustively evaluate automatically, such as passing reasoning or the placement of imported terms.)
  6. Does the artifact accurately - in both a technical and substantive sense - reuse terms from other OBO ontologies?
     • If terms are reused, are their textual and (where relevant) axiomatic definitions preserved?
     • Are imported terms in appropriate hierarchies? That is, has the import of the term preserved its upper-level alignment?
     • Are any additional axioms used for these terms correct in both a technical (e.g. passes reasoning) and substantive sense?
  7. TODO: Continue the list...
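
As a rough illustration of how the term-reuse questions above might be partly checked in a structured way, here is a minimal Python sketch. It assumes the textual definitions of the artifact and of the source ontology have already been extracted into plain dictionaries keyed by term ID; the function name `changed_definitions` and the example IDs are hypothetical and not part of the Dashboard or any OBO tooling.

```python
def changed_definitions(artifact_defs, source_defs):
    """Return IDs of reused terms whose textual definition differs from the source.

    Both arguments are assumed to map a term ID to its textual definition.
    Terms that do not occur in the source ontology are ignored, since they
    are new terms rather than reused ones.
    """
    return sorted(
        term_id
        for term_id, definition in artifact_defs.items()
        if term_id in source_defs
        and definition.strip() != source_defs[term_id].strip()
    )


# Placeholder IDs and definitions, for illustration only.
source = {"SRC:0000001": "A planned process.", "SRC:0000002": "A material entity."}
artifact = {
    "SRC:0000001": "A planned process.",    # reused unchanged
    "SRC:0000002": "Any physical thing.",   # definition was altered on reuse
    "NEW:0000001": "A newly minted term.",  # not a reused term
}
print(changed_definitions(artifact, source))
```

A check like this only covers textual definitions. Whether imported terms sit in appropriate hierarchies, and whether added axioms are substantively correct, would still need a reasoner and a human reviewer.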

Post-review / re-review

Following successful admission, artifacts may be subject to re-review on request from their authors and/or users. The same criteria expressed above will be used. Re-review is recommended should the artifact go through any major changes or if its scope changes. This is especially true if the scope changes such that the artifact has domain overlap with another artifact in the OBO collection.

pbuttigieg, Apr 27 '22 13:04

see also: #1140

nlharris, May 24 '22 01:05

In extension to the above, I want to reiterate my position on the matter:

I have been advocating lighter human reviews and a lower bar for admission, and much higher standards for ontologies already in the foundry to be considered reference with ongoing checks. Here is what I would like concretely:

  1. Admission to OBO Foundry Registry is reasonably open, with a few formal automated checks (Dashboard) and a cursory manual review (1 - 2 hours by a human reviewer). The reviewer makes issues (on issue tracker of ontology) about everything they found (the above suggestions by @cstoeckert and @pbuttigieg can be tweaked but are mostly ok). @alanruttenberg suggests reviewing a random sample of axioms for a big ontology #1919, and I agree with that idea -- but I don't think we need to iterate until all axioms are fixed. The admission criterion is not the fixing of the bad modelling, but the responsiveness demonstrated in addressing the reviewer's concerns (objective). So the acceptance criterion is: all issues the reviewer flagged up are addressed/fixed (responsiveness principle), not "all axioms are correct".
  2. There is an ongoing QC process that checks for (1) COB compliance, (2) non-overlapping term scopes, and (3) coherency. Only ontologies that fulfill all three can get full "reference ontology status". This will take care of 80%-90% of all bad modelling in practice.
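
To make the admission step in point 1 concrete, here is a minimal Python sketch of the two ideas it combines: drawing a random sample of axioms for the reviewer, and accepting on responsiveness rather than on correctness of every axiom. It assumes axioms are available as plain strings and that issue statuses are tracked as simple labels; `sample_axioms`, `review_passes`, the default sample size, and the "addressed" label are all hypothetical.

```python
import random


def sample_axioms(axioms, k=20, seed=None):
    """Pick up to k axioms uniformly at random for a reviewer to inspect."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    pool = list(axioms)
    return rng.sample(pool, min(k, len(pool)))


def review_passes(issue_status):
    """Responsiveness criterion: every issue the reviewer raised was addressed.

    issue_status is assumed to map an issue reference to a status label.
    """
    return all(status == "addressed" for status in issue_status.values())
```

For example, `sample_axioms(all_axioms, k=20, seed=42)` would give a reviewer the same 20 axioms on every run, so the sample can be recorded alongside the issues it led to.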

COB compliance means two simple things:

  1. All terms in the ontology must inherit from a COB term (if one does not exist, it must be added first)
  2. Merging the ontology with COB does not result in unsats (i.e. you can't have a chemical entity in an exposure to chemical branch in the ontology).
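
The first of these two checks could be sketched as follows. This is a minimal illustration in Python, assuming the asserted subclass hierarchy has been exported as a dictionary from term ID to parent IDs and that COB terms can be recognised by a `COB:` prefix; the function names and example IDs are hypothetical. The second check (no unsats after merging with COB) needs an OWL reasoner and is not covered here.

```python
def has_cob_ancestor(term, parents, cob_prefix="COB:"):
    """True if term is, or transitively inherits from, a COB term."""
    stack, seen = [term], set()
    while stack:
        current = stack.pop()
        if current.startswith(cob_prefix):
            return True
        if current in seen:  # guard against cycles in the hierarchy
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return False


def terms_without_cob_ancestor(parents, cob_prefix="COB:"):
    """List all non-COB terms that do not inherit from any COB term."""
    terms = set(parents)
    for parent_ids in parents.values():
        terms.update(parent_ids)
    return sorted(
        t for t in terms
        if not t.startswith(cob_prefix)
        and not has_cob_ancestor(t, parents, cob_prefix)
    )


# Placeholder IDs, for illustration only.
hierarchy = {
    "EX:0000002": ["EX:0000001"],
    "EX:0000001": ["COB:0000001"],
    "EX:0000003": [],  # a root term with no COB parent
}
print(terms_without_cob_ancestor(hierarchy))
```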

Non-overlapping term scopes means two simple things:

  1. There cannot be another term with the same label (sameness here based on a lower-cased lemmatised string) in another OBO reference ontology. I.e. you can't add a new "Alzheimer's Disease" class to your ontology because DO already has one.
  2. And this is the key: You cannot add terms, even if they are new, to a branch of the ontology that is serviced by another ontology according to COB. What does this mean? If OBI is the ontology that, according to COB metadata, deals with assays, you cannot simply add a few assay terms into your ontology - you must demonstrate you have tried to work with the OBI team to get these assay terms into OBI.
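
A label-clash check along the lines of point 1 might look like the sketch below. It is only an approximation: labels are lower-cased and punctuation is collapsed, but no real lemmatisation is done (that would need an NLP library), so treat `normalise_label` as a stand-in. The function names and the term ID in the example are hypothetical placeholders. Point 2 (branch ownership according to COB metadata) is not covered.

```python
import re


def normalise_label(label):
    """Lower-case a label and collapse punctuation and whitespace.

    This is a simplification of "lower-cased lemmatised string":
    no lemmatisation is performed.
    """
    return " ".join(re.sub(r"[^a-z0-9]+", " ", label.lower()).split())


def find_label_clashes(candidate_labels, reference_labels):
    """Map each candidate label to the reference term IDs sharing its normalised label.

    reference_labels is assumed to map a term ID to its label.
    """
    index = {}
    for term_id, label in reference_labels.items():
        index.setdefault(normalise_label(label), []).append(term_id)
    clashes = {}
    for label in candidate_labels:
        hits = index.get(normalise_label(label))
        if hits:
            clashes[label] = sorted(hits)
    return clashes


# Placeholder ID, for illustration only.
reference = {"REF:0000001": "Alzheimer's disease"}
print(find_label_clashes(["Alzheimer's Disease", "some new term"], reference))
```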

Coherency means that an ontology must have no unsatisfiable classes when merged with all of its dependent OBO ontologies. A "dependent OBO ontology" is defined as "any ontology in OBO from which the depending ontology uses one or more terms".
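
The definition of "dependent OBO ontology" can be approximated from term identifiers alone. The sketch below assumes terms are written as CURIEs of the form `PREFIX:LOCALID` and that the prefix identifies the owning ontology; `dependent_ontologies` and the example IDs are hypothetical. The coherency check itself (merging with these ontologies and looking for unsatisfiable classes) requires an OWL reasoner and is not shown.

```python
def dependent_ontologies(term_ids, own_prefix):
    """Return the prefixes of all other ontologies from which terms are used."""
    prefixes = {t.split(":", 1)[0] for t in term_ids if ":" in t}
    prefixes.discard(own_prefix)  # an ontology does not depend on itself
    return sorted(prefixes)


# Illustrative CURIEs only.
used_terms = ["EX:0000001", "OBI:0000001", "GO:0000001", "OBI:0000002"]
print(dependent_ontologies(used_terms, own_prefix="EX"))
```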

Some details need to be hashed out, like ontologies with already overlapping scope (DO, Mondo, NCIT, OMIT), but this is immaterial - it's just the principle.

matentzn, May 27 '22 05:05

Where are we at with this? Do we want to discuss this during the governance-related call (whenever we manage to schedule that)?

nlharris, Jul 25 '22 20:07

We agreed during the call that this ticket can be closed. Review criteria are now codified, but we still need to discuss "Foundry status". Someone will make a new ticket for that when there's a coherent proposal.

nlharris, Sep 06 '22 16:09