Ontospy Any idea how to improve the performance when handling large ontologies?

Any idea how to improve the performance when handling large ontologies?

Open leonqli opened this issue 6 years ago • 7 comments

Jun 13 '18 20:06 leonqli

Do you have any sample ontology in mind? I'm trying to wrap my head around this problem so it'd be useful to have some sample models for testing.

Dec 06 '18 23:12 lambdamusic

You could try this ontology: http://bioportal.bioontology.org/ontologies/NCBITAXON

Dec 10 '18 16:12 leonqli

The main problem is that Ontospy attempts to build the entire ontology model in memory, and that takes time if there are many classes and properties to correlate.

I've tried using threads, but with no real performance improvement, as the main tasks (extracting classes, properties, concepts, etc.) tend to depend on each other.

For very large ontologies it may be more appropriate to use a triplestore. Otherwise I'm kind of out of ideas here.

Jan 03 '19 12:01 lambdamusic

You may want to take a look at https://pythonhosted.org/Owlready2/. It seems to have better performance on large ontologies.

Jan 03 '19 12:01 leonqli

Thanks! Looks like they use an ad-hoc back end, maybe that's it. Will look more into it.

Update: the back end is an optimized SQLite index (e.g. view here).
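To make the idea of an SQLite-indexed back end concrete, here is a minimal sketch using only Python's standard `sqlite3` module. It is not Owlready2's actual schema or Ontospy code: the table layout, the index names, the `build_store` and `direct_subclasses` helpers, and the `http://example.org/` IRIs are all assumptions made for illustration. The point is only that triples stored on disk with indexes can be queried without first building the whole model in memory.

```python
import sqlite3

RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/"  # hypothetical namespace for the toy data


def build_store(triples, path=":memory:"):
    """Load (subject, predicate, object) triples into an indexed SQLite table.

    Pass a file path instead of ":memory:" to keep the data on disk.
    """
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    # Indexes let lookups by (predicate, object) or (subject, predicate)
    # avoid scanning the whole table.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_po ON triples (p, o)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_sp ON triples (s, p)")
    conn.commit()
    return conn


def direct_subclasses(conn, class_iri):
    """Return the IRIs of the direct subclasses of class_iri, sorted."""
    rows = conn.execute(
        "SELECT s FROM triples WHERE p = ? AND o = ? ORDER BY s",
        (RDFS_SUBCLASS_OF, class_iri),
    )
    return [row[0] for row in rows]


# Toy data standing in for a large taxonomy.
conn = build_store([
    (EX + "Bacteria", RDFS_SUBCLASS_OF, EX + "Organism"),
    (EX + "Archaea", RDFS_SUBCLASS_OF, EX + "Organism"),
    (EX + "Escherichia", RDFS_SUBCLASS_OF, EX + "Bacteria"),
])
print(direct_subclasses(conn, EX + "Organism"))
```

With this layout each lookup touches only the rows it needs, so class hierarchies could be resolved lazily rather than all at once.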

Jan 03 '19 12:01 lambdamusic

Yes, they use SQLite as the back end. Do you think that would help improve the performance of Ontospy?

Jan 10 '19 17:01 leonqli

It's also not too difficult to load an ontology into Apache Jena Fuseki. The main issue is the non-Python dependency (Fuseki), but once the store is running it's easy to use rdflib to mediate querying.
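As a rough sketch of what querying such a store could look like, the snippet below builds a SPARQL 1.1 Protocol GET request and parses the standard SPARQL JSON results format. Note that it uses only the Python standard library rather than rdflib, so it runs without extra dependencies. The endpoint URL (`http://localhost:3030/ds/sparql`, with `ds` as the dataset name) and the helper names are assumptions; adjust them to match your own Fuseki setup.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoint: a local Fuseki server with a dataset named "ds".
ENDPOINT = "http://localhost:3030/ds/sparql"

QUERY = """
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?cls WHERE { ?cls a owl:Class } LIMIT 10
"""


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def parse_bindings(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    data = json.loads(payload)
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in data["results"]["bindings"]
    ]


def run_query(endpoint, query):
    """Send the query and return the parsed rows (needs a running server)."""
    with urlopen(build_request(endpoint, query)) as resp:
        return parse_bindings(resp.read().decode("utf-8"))


# Usage, once a store is actually running at ENDPOINT:
# for row in run_query(ENDPOINT, QUERY):
#     print(row["cls"])
```

Because the store answers each query server-side, the client only ever holds the rows it asked for, not the full ontology.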

May 17 '22 17:05 jclerman
