Principle #3 IDs - automated validation
FP 3 - URI/Identifier Space
Automated checks:
- Core terms must follow NAMESPACE_NUMID format
Mechanism:
Any IRI that starts with http://purl.obolibrary.org/obo/IDSPACE must end with _NUMID. The IDSPACE comes from the registry. Every IRI that starts with http://purl.obolibrary.org/obo/ must use a valid registry ID.
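A minimal sketch of this mechanism in Python, for illustration only. The `REGISTRY_IDSPACES` set and the `check_iri` helper are hypothetical names; a real check would load the ID spaces from the registry, and the case-sensitive match here is an assumption.

```python
import re

OBO_PREFIX = "http://purl.obolibrary.org/obo/"

# Hypothetical sample of registered ID spaces (assumption: loaded from the registry).
REGISTRY_IDSPACES = {"DOID", "GO", "ARO"}

# Local part of the IRI: IDSPACE, an underscore, then ASCII digits only.
LOCAL_ID = re.compile(r"([A-Za-z]+)_([0-9]+)")


def check_iri(iri):
    """Return None if the IRI passes (or is not an OBO IRI), else a message."""
    if not iri.startswith(OBO_PREFIX):
        return None  # external IRIs are outside the scope of this check
    local = iri[len(OBO_PREFIX):]
    idspace = local.split("_", 1)[0]
    if idspace not in REGISTRY_IDSPACES:
        return f"unknown ID space: {idspace}"
    if LOCAL_ID.fullmatch(local) is None:
        return f"local ID is not IDSPACE_NUMID: {local}"
    return None
```

With this sketch, `http://purl.obolibrary.org/obo/DOID_0000001` passes, while an IRI under an unregistered ID space or with a non-numeric suffix returns a message.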
~~There may be external classes that we can't predict. But, if the ontology uses oboInOwl:hasOBONamespace on terms, we can check those entities. The value of that annotation should match NAMESPACE (ignoring case).~~
~~If any class annotated with oboInOwl:hasOBONamespace in the core namespace does not follow NAMESPACE_NUMID format (e.g. 'doid' namespace follows DOID_0000001 etc.), throw a warning.~~
I know some ontologies may use text in the identifiers of properties, so maybe that can be an info message? It is currently an error.
Should ontologies be allowed to annotate external terms with hasOBONamespace in their namespace? For example, ARO has some external terms (e.g. from DOID) with the namespace value 'antibiotic_resistance'.
In cases where the namespace annotation isn't used, we can just check that classes use numeric format if the IRI contains /obo/NS_ (where NS is the actual namespace)
All IRIs must be unique. If an IRI is duplicated, the annotations will be merged in OWL. Duplicate labels and definitions may be a sign of two different terms with the same IRI. Labels and definitions may be duplicated for other reasons, though.
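One way to surface suspicious merges, sketched in Python. This assumes the label annotations have already been extracted as `(iri, label)` pairs; the function name `iris_with_multiple_labels` is hypothetical. A hit is only a hint, since labels may be duplicated for other reasons.

```python
from collections import defaultdict


def iris_with_multiple_labels(label_assertions):
    """Map each IRI that carries more than one distinct label to its labels.

    label_assertions: iterable of (iri, label) pairs (assumed input shape).
    """
    labels_by_iri = defaultdict(set)
    for iri, label in label_assertions:
        labels_by_iri[iri].add(label)
    # Keep only IRIs whose annotations look like two terms merged into one.
    return {
        iri: sorted(labels)
        for iri, labels in labels_by_iri.items()
        if len(labels) > 1
    }
```

The same grouping could be applied to definitions in place of labels.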
Some ontologies do not use numeric identifiers for everything (e.g. PR).
In the future, we are aiming to have base artefacts for each ontology. The base only contains the terms in that ontology's namespace.
Revised check:
Any IRI that starts with http://purl.obolibrary.org/obo/IDSPACE must end with _NUMID. The IDSPACE comes from the registry. Every IRI that starts with http://purl.obolibrary.org/obo/ must use a valid registry ID.
I don't like relying on oio:hasOBONamespace
On Thu, Aug 15, 2019 at 2:10 PM Becky Jackson [email protected] wrote:
Should ontologies be allowed to annotate external terms with hasOBONamespace in their namespace? For example, ARO has some external terms (e.g. from DOID) with the namespace value 'antibiotic_resistance'.
In cases where the namespace annotation isn't used, we can just check that classes use numeric format if the IRI contains /obo/NS_ (where NS is the actual namespace)
On the call on Thursday, we revised this to my comment above so that oboInOwl:hasOBONamespace is no longer involved. I'll update the mechanism in my original post to reflect this.
For historical reasons, annotation properties may use hashes (e.g. subset definitions). Object and data properties should not.
This check will ignore annotation properties and apply only to classes, object properties, and data properties.
The IRI must start with http://purl.obolibrary.org/obo/IDSPACE_.
It is recommended that IRIs end with NUMID. If the identifier following _ is not numeric, we will issue a warning. This requires review to ensure that the identifiers don't include semantics.
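The updated rules could be sketched as follows in Python. This is an illustration under stated assumptions, not the actual validator: `check_entity`, the entity-type strings, and the `ERROR`/`WARN` level names are all hypothetical, and the entities passed in are assumed to be the ontology's own terms.

```python
import re

OBO_PREFIX = "http://purl.obolibrary.org/obo/"

# Annotation properties are deliberately absent, so they are skipped.
CHECKED_TYPES = {"class", "object property", "data property"}


def check_entity(iri, entity_type, idspace):
    """Return None if the entity passes, else a (level, message) tuple."""
    if entity_type not in CHECKED_TYPES:
        return None  # e.g. annotation properties, which may use hashes
    expected = f"{OBO_PREFIX}{idspace}_"
    if not iri.startswith(expected):
        return ("ERROR", f"IRI does not start with {expected}")
    local_id = iri[len(expected):]
    if re.fullmatch(r"[0-9]+", local_id) is None:
        # Non-numeric IDs are allowed but need review for embedded semantics.
        return ("WARN", f"identifier is not numeric: {local_id}")
    return None
```

Under these assumptions, an identifier such as `NCIT_C12218` would produce a warning instead of an error.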
Update.
The counterexample doesn't really apply. NCIT is not necessarily an OBO Foundry ontology.
There is an OBO version of NCIT that does use the NCIT_ namespace, e.g.:
http://purl.obolibrary.org/obo/NCIT_C12218
AFAIK, this principle is meant to apply to OBO Foundry ontologies.
Question: At present we don't allow numbers in the namespace. I think we should consider it. For example, should OBO:COVID-19_ be allowed?
cc @bpeters42
This issue is only for automatic validation. Please move other discussion of the principle to issue #954, which I am about to update with the latest text.