Merging 14 Ontologies (huge merge)
Hello,
I am trying to merge 14 ontologies at once with Boomer: DERMO, DO, HUGO, ICDO, IDO, IEDB, MESH, MFOMD, MPATH, NCIT, OBI, OGMS, ORPHANET and SCDO.
This is how I proceed:
- I compute the 91 LOGMAP alignments between every pair of ontologies (i.e. 91 = n(n-1)/2 with n=14)
- I convert and merge these alignments into a single ptable (Boomer format)
- I join all these ontologies into a single "union" OWL file (622K classes ~ 2.5 GB)
- I launch Boomer on the union OWL file and the single ptable (54K entries ~ 7 MB).
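The first two steps above can be sketched as follows. This is only an illustration: it checks the pair count and shows one possible way to merge per-pair tables. The row layout (mapped terms in the first two fields) and the "keep the first occurrence" de-duplication policy are assumptions made for the example, and `ontology_pairs` and `merge_ptables` are hypothetical helper names, not part of Boomer or LogMap.

```python
from itertools import combinations

ONTOLOGIES = [
    "DERMO", "DO", "HUGO", "ICDO", "IDO", "IEDB", "MESH",
    "MFOMD", "MPATH", "NCIT", "OBI", "OGMS", "ORPHANET", "SCDO",
]


def ontology_pairs(names):
    """Return every unordered pair of ontology names, n(n-1)/2 in total."""
    return list(combinations(sorted(names), 2))


def merge_ptables(tables):
    """Concatenate per-pair tables into one list of rows.

    Assumption: each row is a tuple whose first two fields are the mapped
    terms. If the same (left, right) pair appears more than once, only the
    first row is kept.
    """
    seen = set()
    merged = []
    for rows in tables:
        for row in rows:
            key = (row[0], row[1])
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return merged


pairs = ontology_pairs(ONTOLOGIES)
print(len(pairs))  # 91
```

Each pair in `pairs` corresponds to one alignment run, and `merge_ptables` would receive the 91 converted tables in any order.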
I have run various tests and it seems that when the ptable is too large, the problem becomes intractable.
After removing MESH and NCIT (i.e. merging only the 12 remaining ontologies), the resulting union ontology has only 81K classes (242 MB) and the ptable contains only 7K entries. In this case, Boomer finishes with a result in 30 min (on an i7 at 1.90 GHz with 32 GB RAM).
But I also need the MESH and the NCIT ontologies to be included in my merge result.
Overall, I am wondering whether this is the correct way to proceed.
Here are some questions:
- Should I continue with this strategy? That is, should I keep trying to merge everything at once, so that Boomer has complete decision power in selecting the best mappings (without introducing any bias)?
- Or should I change my merging strategy? That is, should I split the problem into smaller sub-problems, organize them in some order (according to some criteria, which could introduce some bias), and launch Boomer following this order?
For example, I could try this:
- I convert the 91 alignments into 91 ptables (instead of converting and merging them into 1 single ptable)
- For each of the 91 ptables:
  - I launch Boomer with this ptable and the union OWL file.
  - In the union OWL file, I add all the equivalence axioms generated by Boomer for this ptable.
So far, this seems to run much faster. The problem is that the arbitrary order of the for-loop introduces a bias, since each equivalence axiom added at one step influences Boomer's results in the following steps.
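To make the order dependence concrete, here is a minimal sketch of the incremental loop described above. It does not call Boomer: `run_boomer` is a hypothetical callable standing in for "launch Boomer with this ptable and the current union ontology", and `toy_resolver` is a deliberately simplistic stand-in that does not reflect Boomer's actual reasoning. Rows are assumed to be `(left, right, probability)` tuples purely for illustration.

```python
def incremental_merge(ptables, run_boomer):
    """Process ptables one at a time, feeding accepted axioms forward.

    `run_boomer(ptable, accepted)` is a hypothetical wrapper that returns
    the set of (left, right) equivalences chosen for this ptable, given
    the equivalences accepted in earlier steps.
    """
    accepted = set()
    for ptable in ptables:
        new_axioms = run_boomer(ptable, frozenset(accepted))
        accepted |= set(new_axioms)
    return accepted


def toy_resolver(ptable, accepted):
    """Toy stand-in (NOT Boomer): accept a mapping only if its probability
    exceeds 0.5 and neither term was matched in an earlier step."""
    used = {term for pair in accepted for term in pair}
    return {
        (left, right)
        for (left, right, prob) in ptable
        if prob > 0.5 and left not in used and right not in used
    }


table_ab = [("A:1", "B:1", 0.9)]
table_ac = [("A:1", "C:1", 0.8)]

forward = incremental_merge([table_ab, table_ac], toy_resolver)
backward = incremental_merge([table_ac, table_ab], toy_resolver)
print(forward)   # {('A:1', 'B:1')}
print(backward)  # {('A:1', 'C:1')}
```

With this toy resolver, the same two tables give different results depending on which one is processed first, which is the kind of bias I am worried about.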
Any suggestions?
Oliver
PS: I couldn't attach the Boomer input union ontology (~140 MB compressed) since the maximum attachment size is 25 MB. However, the input ptable is attached here: ptable-91-mappings.zip.