Use a single merged ontology model
A single merged model would allow inference across all the ontologies we use.
It has several other advantages:
- remove redundant copies of ontologies such as BFO and RO from memory
- use a single index for searching
- remove the need for collecting and merging parents or children from multiple ontologies
One downside is that we lose some flexibility, such as managing inference capabilities and search per ontology. Notably, we currently disable such features for ChEBI.
To achieve this, we have to either add support for merging ontology models to baseCode or use Jena directly in Gemma.
It's also an open question whether such a large ontology would even be loadable in Protege.
We need to be able to use Protege to edit our own ontology. Even with just a few ontologies (EFO is the largest by far), Protege takes over 6 GB of RAM on my laptop, and it takes a while to open, which is annoying.
Maybe there is a workaround or some setting that reduces this (e.g. disabling the reasoner), but managing our extensions of the ontologies is one of the only solid reasons I can think of for importing them all. The other things you list are nice, but the merge may not be practical. I guess we can try and see what happens...
Re ChEBI, see also https://github.com/PavlidisLab/Gemma/issues/748
I agree, it would be silly to bloat TGEMO for this. The merge doesn't have to be done using import statements; it can be done programmatically by adding submodels to a larger model that we create on the server.
Something important I noticed when working with Jena: ontologies generally do not import RO, but use its terms extensively. If we are to use a single model, importing RO would be important for inferring sub-properties and getting the parents/children right.
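To make the RO point concrete, here is a minimal toy sketch in plain Java. It does not use Jena or baseCode; the `MergedModelSketch` class, its methods, and all the `ex:` identifiers are hypothetical and only stand in for real ontology terms. It shows that a query over a parent property only finds terms linked through a sub-property once the triples that declare the sub-property relationship are part of the merged model.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy illustration only: triples are plain strings, not Jena statements.
class MergedModelSketch {
    static final String SUB_PROPERTY_OF = "rdfs:subPropertyOf";

    static final class Triple {
        final String subject, predicate, object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Concatenates the triples of several models into one merged model. */
    static List<Triple> merge(List<List<Triple>> models) {
        List<Triple> merged = new ArrayList<>();
        for (List<Triple> model : models) {
            merged.addAll(model);
        }
        return merged;
    }

    /** Returns the property plus everything declared, transitively, as its sub-property. */
    static Set<String> withSubProperties(List<Triple> model, String property) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(property);
        queue.add(property);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (Triple t : model) {
                // seen.add returns false for known properties, so cycles terminate
                if (t.predicate.equals(SUB_PROPERTY_OF) && t.object.equals(current) && seen.add(t.subject)) {
                    queue.add(t.subject);
                }
            }
        }
        return seen;
    }

    /** Returns the objects linked to a term via the property or any of its sub-properties. */
    static Set<String> related(List<Triple> model, String term, String property) {
        Set<String> properties = withSubProperties(model, property);
        Set<String> result = new LinkedHashSet<>();
        for (Triple t : model) {
            if (t.subject.equals(term) && properties.contains(t.predicate)) {
                result.add(t.object);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A domain ontology that uses a relation without importing its definition.
        List<Triple> domain = List.of(new Triple("ex:hippocampus", "ex:specificPartOf", "ex:brain"));
        // The relation ontology that declares the sub-property axiom.
        List<Triple> relations = List.of(new Triple("ex:specificPartOf", SUB_PROPERTY_OF, "ex:partOf"));

        System.out.println(related(domain, "ex:hippocampus", "ex:partOf")); // prints []
        System.out.println(related(merge(List.of(domain, relations)), "ex:hippocampus", "ex:partOf")); // prints [ex:brain]
    }
}
```

In a real setup the sub-property closure would presumably come from the reasoner attached to the merged model rather than from hand-written code like this; the sketch only shows why the RO triples need to be in the same model.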
We can now use a TDB store with baseCode, populated with all the ontologies we need.