Clarification regarding identifier modeling pattern in OBI (and OBO) needed
The problem
There are identifiers, e.g. DOI, PubMed ID or specimen identifier, that OBI subsumes under IAO:symbol and IAO:centrally registered identifier symbol, although the classes IAO:identifier and IAO:centrally registered identifier also exist. Some OBO ontologies follow OBI, reusing these classes and/or defining new identifiers under IAO:symbol or IAO:CRID symbol, while others use IAO:identifier and IAO:centrally registered identifier instead.
This discrepancy has historical reasons: IAO:identifier most likely wasn't available when these identifiers were coined in OBI, as it was first defined in PNO using the IAO namespace and was only recently merged into IAO (https://github.com/information-artifact-ontology/IAO/issues/236). I feel a bit bad being both the one who started that overdue integration proposal and the one now pointing to the consequences it entails that have not yet been addressed. But I now know more about previous OBO work and that this issue has been lingering for more than a decade (see the very important context here & here). So, with the pending integration of PNO into IAO, I think we have a great opportunity to harmonize the existing practices of modeling identifiers as subclasses of either IAO:symbol or IAO:identifier by making really clear what the difference between IAO:symbol and IAO:identifier is. I already referred to this discrepancy in this now closed IAO issue, where @zhengj2007 suggested that writing an issue here might be best, although it is equally relevant for IAO and other ontologies reusing and/or defining identifiers.
I've tried my best to gather the information I could find regarding this issue (mostly by using OLS and following links in editor notes/comments or GH issues/PRs), which will hopefully allow us to sort things out in a way most or hopefully all can agree on.
Which OBO ontologies follow the OBI pattern, using IAO:symbol as parent for identifiers?
- CHEMINF defines a lot (mostly chemical DB ids) under IAO:centrally registered identifier symbol
- see EUPATH:IRB number
- see the children of IAO:symbol in GENEPIO
- ICO:review board approval number
- LABO:medical record identifier
- I guess the whole NOMEN - A nomenclatural ontology for biological names
- ORNASEQ:experiment name
- in PDRO children of IAO:centrally registered identifier symbol
Which OBO ontologies follow the PNO pattern, using IAO:identifier as parent for identifiers?
- In Apollo_SV, subsumed under IAO:identifier, we find not only a duplicate of the most prominent example, the PubMed identifier, but also terms imported from OBI that have been changed to be children of IAO:identifier, such as grant identifier or digital object identifier
- DIDEO:drug concept set identifier
- OHD:current dental terminology code
- in PROCO three children under IAO:centrally registered identifier (including a likely duplicate of CHEMINF:CAS registry number)
Which OBO ontologies use a mix of IAO:symbol and IAO:identifier as parents for identifiers?
- EUPATH:IRB number as IAO:symbol and geohash code as IAO:identifier
- in OBIB we have the two IAO:symbols courier tracking number & identification number of a protocol at a particular site (which looks very similar to EUPATH:IRB number) and quite a lot of children of IAO:identifier
- in OMRSE see the children of IAO:symbol & social security number
- OPMI:centrally registered study identifier symbol and children of IAO:centrally registered identifier
Which OBO ontologies use identifier-like classes defined directly under IAO:ICE?
- ONS:sampling identifier, ONS:event identifier name
- ONTONEO:identifier code, seemingly still a work in progress; problematic anyhow, as it reuses the deprecated relation "designates", imported from an OMIABIS version of unknown age
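The grouping used in the lists above can be written down as a small check. This is only a sketch under assumptions: the records are copied by hand from the lists in this issue, the parent labels are the ones used here rather than IRIs, and `classify` is a hypothetical helper, not part of any OBO tooling. A real audit would pull the subclass axioms from OLS or the OWL files, which is not shown.

```python
from collections import defaultdict

# Parent labels as written in this issue; grouping them into two
# "patterns" is an assumption made for illustration.
SYMBOL_PARENTS = {"IAO:symbol", "IAO:centrally registered identifier symbol"}
IDENTIFIER_PARENTS = {"IAO:identifier", "IAO:centrally registered identifier"}


def classify(records):
    """Map each ontology to the pattern its identifier classes follow."""
    kinds_by_ontology = defaultdict(set)
    for ontology, _term, parent in records:
        if parent in SYMBOL_PARENTS:
            kinds_by_ontology[ontology].add("symbol")
        elif parent in IDENTIFIER_PARENTS:
            kinds_by_ontology[ontology].add("identifier")
        else:
            kinds_by_ontology[ontology].add("other")

    result = {}
    for ontology, kinds in kinds_by_ontology.items():
        if kinds == {"symbol"}:
            result[ontology] = "OBI pattern"
        elif kinds == {"identifier"}:
            result[ontology] = "PNO pattern"
        elif kinds == {"other"}:
            result[ontology] = "other parent"
        else:
            result[ontology] = "mixed"
    return result


# Hand-copied examples from the lists above (not an exhaustive survey).
RECORDS = [
    ("LABO", "medical record identifier", "IAO:symbol"),
    ("DIDEO", "drug concept set identifier", "IAO:identifier"),
    ("EUPATH", "IRB number", "IAO:symbol"),
    ("EUPATH", "geohash code", "IAO:identifier"),
    ("ONS", "sampling identifier", "IAO:ICE"),
]

if __name__ == "__main__":
    for ontology, pattern in sorted(classify(RECORDS).items()):
        print(f"{ontology}: {pattern}")
```

Working from (ontology, term, parent) triples keeps the check independent of how the hierarchy is obtained, so the same function could be fed from an OLS export or a parsed OWL file.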
Probably not relevant for OBO?
In AFO we can see the "mixup" of what is considered an identifier by looking at how they extend IAO:symbol as well as IAO:identifier. What is most obvious is that they don't pull in upstream changes regularly (which was confirmed to me via email by the company contracted by Allotrope to do the ontology dev work).
Other related OBI issues I found that are not yet referenced
- https://github.com/biobanking/biobanking/issues/88
- https://github.com/obi-ontology/obi/issues/1246
Other, probably historically relevant context: http://icbo.buffalo.edu/Presentations/Ruttenberg.pdf (ICBO 2012 presentation?)
- relevant section on identifiers starting with page 65
- relevant section on symbols starting with page 96
To what extent is Ceusters 2012, "An Information Artifact Ontology Perspective on Data Collections and Associated Representational Artifacts", relevant for this issue? Could "term" and "denotator" be helpful distinctions in the clarification?