ModelPolisher
New identifiers.org identifier pattern leads to duplicate identifiers
My model contains CURIE links for identifiers.org with the pattern prefix:identifier. After running ModelPolisher, I saw that the BiGG identifiers that used this pattern are now duplicated in the model. For example:
<rdf:li rdf:resource="https://identifiers.org/bigg.metabolite:10fthf"/>
<rdf:li rdf:resource="https://identifiers.org/bigg.metabolite/10fthf"/>
Both of these CURIE links resolve. This indicates that identifiers.org links currently accept two CURIE patterns, namely prefix/identifier and prefix:identifier. However, the latter seems to be the newer pattern, as it is the only one listed on identifiers.org as well as bioregistry.io.
All BiGG identifiers are duplicated, or more precisely, each one is added to the model with the pattern prefix/identifier if it is not already present in that form. The problem therefore seems to be caused by ModelPolisher. In case it is caused by annotateDB instead, the corresponding annotateDB issue can be found here: https://github.com/matthiaskoenig/annotatedb/issues/33
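To check a model for this kind of duplication, here is a minimal sketch in Python. It is not part of ModelPolisher or annotateDB; `canonical_key` and `find_duplicates` are hypothetical helper names, and the regular expression assumes that the prefix itself never contains `:` or `/`. It reduces both URI patterns to the same (prefix, identifier) pair and reports pairs that occur more than once:

```python
import re
from collections import defaultdict

# Assumption: the prefix contains neither ':' nor '/', so the first ':' or '/'
# after the host separates prefix and identifier in both patterns.
_IDENTIFIERS_ORG = re.compile(
    r"^https?://identifiers\.org/(?P<prefix>[^:/]+)[:/](?P<id>.+)$"
)


def canonical_key(uri):
    """Return (prefix, identifier) for an identifiers.org URI, else None."""
    match = _IDENTIFIERS_ORG.match(uri)
    if match is None:
        return None
    return (match.group("prefix"), match.group("id"))


def find_duplicates(uris):
    """Group URIs by canonical key and keep only keys seen more than once."""
    groups = defaultdict(list)
    for uri in uris:
        key = canonical_key(uri)
        if key is not None:
            groups[key].append(uri)
    return {key: found for key, found in groups.items() if len(found) > 1}


if __name__ == "__main__":
    resources = [
        "https://identifiers.org/bigg.metabolite:10fthf",
        "https://identifiers.org/bigg.metabolite/10fthf",
    ]
    for key, found in find_duplicates(resources).items():
        print(key, found)
```

Identifiers that embed their own namespace (for example, ones that themselves contain a colon) may need extra handling, so treat the output as a hint rather than a definitive list.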
I could not reproduce this behaviour using the current state of the 2.1 branch; however, I am not quite certain at what point it was changed/fixed.
I'll add the 2.1 tag to this issue and will consider it fixed when the 2.1 release is done.
Here is the relevant test:
https://github.com/draeger-lab/ModelPolisher/blob/113b46764b2df15a0333e6a1244fa1215cee7c39/lib/src/test/java/de/uni_halle/informatik/biodata/mp/annotation/bigg/BiGGSpeciesAnnotatorTest.java#L233