
Principle #2 common format - automated validation

Open beckyjackson opened this issue 6 years ago • 12 comments

FP 2 - Common Format

Automated checks:

  1. The OWL PURL must resolve to RDF/XML

Mechanism:

We can ensure that the ontology properly loads in ROBOT, but this does not confirm that the format is RDF/XML. Unfortunately, it seems like the format data is lost after the ontology is loaded with the OWLAPI OWLOntologyManager. We can check the first line of the file to see if it starts with <?xml version=. I'm open to other suggestions here.
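
The first-line check described above could be sketched roughly as follows. This is a minimal Python illustration, not part of ROBOT; the function names are hypothetical. Note the assumption it bakes in: the XML declaration is optional in XML 1.0, and other XML serializations such as OWL/XML also begin with it, so this is only a heuristic.

```python
UTF8_BOM = b"\xef\xbb\xbf"


def has_xml_declaration(head):
    """Return True if the given leading bytes start with an XML declaration.

    `head` is the first few bytes of the file (hypothetical helper; 64 bytes
    is more than enough for this check).
    """
    # Tolerate a UTF-8 byte order mark in front of the declaration.
    if head.startswith(UTF8_BOM):
        head = head[len(UTF8_BOM):]
    return head.startswith(b"<?xml version=")


def file_has_xml_declaration(path):
    """Read only the start of the file at `path` and apply the check."""
    with open(path, "rb") as handle:
        return has_xml_declaration(handle.read(64))
```

A file that passes is XML of some kind; it says nothing yet about whether the content is RDF/XML.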

beckyjackson avatar Aug 09 '19 16:08 beckyjackson

Unfortunately, it seems like the format data is lost after the ontology is loaded with the OWLAPI OWLOntologyManager.

@beckyjackson have you tried this method?

http://owlcs.github.io/owlapi/apidocs_4/org/semanticweb/owlapi/model/OWLOntologyManager.html#getOntologyFormat-org.semanticweb.owlapi.model.OWLOntology-

balhoff avatar Aug 09 '19 17:08 balhoff

Yes, unfortunately it returned null after loading with the ROBOT IOHelper 🙁 Do you know a way to keep that information @balhoff ?

beckyjackson avatar Aug 09 '19 17:08 beckyjackson

No, sorry, I thought that would work! It's not something I have used before though.

balhoff avatar Aug 09 '19 20:08 balhoff

The root node should be rdf:RDF - @jamesaoverton
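
A root-element check along these lines might look like the sketch below. It uses Python's standard `xml.etree.ElementTree` rather than ROBOT or the OWLAPI, and the function name `root_is_rdf` is made up for illustration. It parses the whole document, so it assumes the file fits comfortably in memory.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def root_is_rdf(xml_bytes):
    """Return True if the document parses as XML and its root is rdf:RDF."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError:
        # Not well-formed XML at all (e.g. Turtle or OBO format).
        return False
    # ElementTree expands prefixes, so the tag is "{namespace}localname".
    return root.tag == "{%s}RDF" % RDF_NS
```

Because the comparison is on the expanded namespace, the check does not depend on the file using the conventional `rdf:` prefix.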

beckyjackson avatar Aug 15 '19 21:08 beckyjackson

Jena?

I think at a minimum it should parse using Jena, which guarantees that it is in some RDF format. IMHO that is a more pragmatic requirement than the stricter RDF/XML one.

cmungall avatar Aug 19 '19 15:08 cmungall

Even though RDF/XML is not my preferred format, I like the predictability of the strict rule.

jamesaoverton avatar Aug 19 '19 18:08 jamesaoverton

Wait, it just occurred to me: surely we can use the OWLAPI with only the RDF/XML parser registered? If parsing fails, then the file is not valid RDF/XML.

cmungall avatar Aug 20 '19 03:08 cmungall

This check should be expanded: the base ontology should be RDF/XML.

However, we should also mandate that any imported ontology be in at least some RDF format (Turtle or RDF/XML).

Some ontologies have an .obo format file in their imports and this causes problems with OWL-based toolchains like Owlready2 (reported by SciBite cc @simonjupp )

For more on this particular instance of the issue: https://github.com/HUPO-PSI/psi-ms-CV/issues/26

I am not sure how best to implement this in ROBOT/OWLAPI.
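
Outside of ROBOT/OWLAPI, a rough first pass could simply list the `owl:imports` IRIs declared in an RDF/XML file and flag any whose name ends in `.obo`. The Python sketch below does only that; the function names are hypothetical, and the extension test is an assumption made for illustration. An IRI's suffix is only a hint, so a real check would still need to fetch each import and inspect its content.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"


def import_iris(rdfxml_bytes):
    """Return the IRIs named by owl:imports elements, in document order."""
    root = ET.fromstring(rdfxml_bytes)
    iris = []
    for element in root.iter("{%s}imports" % OWL_NS):
        iri = element.get("{%s}resource" % RDF_NS)
        if iri:
            iris.append(iri)
    return iris


def obo_named_imports(rdfxml_bytes):
    """Return import IRIs whose name suggests an OBO-format file."""
    return [iri for iri in import_iris(rdfxml_bytes)
            if iri.lower().endswith(".obo")]
```

An empty result from `obo_named_imports` does not prove that the imports are RDF, since a PURL without an extension can resolve to any format.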

cmungall avatar Mar 11 '20 16:03 cmungall

This check should be expanded: the base ontology should be RDF/XML.

I never knew we were going that way, but it makes sense with Jena in mind. Is there any hope of proposing that release ontologies be merged in general? Having imports makes proper versioning really difficult to manage.

matentzn avatar Mar 11 '20 17:03 matentzn

Update: @cmungall's requirement seems necessary. Is there another way to validate imports that are in a different format?
This check now depends on working out how to verify the format of imports.

cc @bpeters42

wdduncan avatar May 12 '20 16:05 wdduncan

Does the file really need to be RDF/XML? Would not other OWL RDF formats be acceptable? (Maybe this should be a separate issue.)

ramonawalls avatar Jul 15 '20 18:07 ramonawalls

@ramonawalls related discussion: #360

balhoff avatar Jul 15 '20 19:07 balhoff