goatools
Consider support for obographs json format
Note that we now have a proposed JSON representation of OBO that would obviate the need for special-purpose parsers. Your comments as a developer would be most welcome:
- https://github.com/geneontology/obographs/
See also this post describing the motivation:
- https://douroucouli.wordpress.com/2016/10/04/a-developer-friendly-json-exchange-format-for-ontologies/
@cmungall We will definitely add support for loading the gene ontologies from a JSON representation. This is great news and should make the runs faster. Do you know when the Gene Ontology Consortium might have their gene ontologies available in JSON format? Is there anything we can do to help?
We don't want to start making official releases yet, as we may want to change the format. For now there are converters, but these are not ideal for you, as they involve an external dependency.
Thanks, @cmungall. Yes, the external dependencies would be an issue for our users. We are very much looking forward to using the JSON files. Please let us know if you need any code, tests, or testing from us.
FYI: Our speed tests show that reading GO DAGs from a JSON format is 287% faster than reading ASCII text from an obo file.
We ran the tests using the available go-plus.json, which contains more extensive information than the smaller ASCII text go-basic.obo that is most frequently used in current GOEAs by the GOATOOLS user community. In our speed calculations, we accounted for the different file sizes of go-plus.json and go-basic.obo: go-plus.json is information-rich, while go-basic.obo is lean but extremely functional.
We look forward to supporting GO DAGs available in the JSON format.
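For anyone wanting to repeat a comparison like this, here is a minimal sketch of one way to account for file size: divide the load time by the size of the file, so that a large JSON file and a small obo file are compared per megabyte. This is only an illustration and not necessarily how the numbers above were produced. The function name `seconds_per_megabyte` and the `loader` callable are hypothetical.

```python
import os
import time


def seconds_per_megabyte(path, loader):
    """Time loader(path) and normalize by file size (hypothetical helper).

    `loader` is any callable that takes a file path and parses the file,
    for example a JSON reader or an obo reader.
    """
    size_mb = os.path.getsize(path) / 1e6
    if size_mb == 0:
        raise ValueError("cannot normalize the load time of an empty file")
    start = time.perf_counter()
    loader(path)  # the parsed result is discarded; only the time matters here
    elapsed = time.perf_counter() - start
    return elapsed / size_mb
```

Calling this once per file with the matching loader gives two per-megabyte figures that can be compared directly. A single run is noisy, so repeating each measurement several times and taking the minimum or median would be more reliable.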
@cmungall
Haven't talked with you in a while since we wrote up the GOATOOLS paper.
I'm cleaning up some old issues in the repo... I just found that we had a dialog back then on supporting the GO DAG in JSON format. I wonder whether this format is now stable, and whether you would like to see some support in goatools (Python GO library), or whether you have other libraries covering this format that we could simply import. cc @dvklopfenstein
yes, it would be great to have support
the json format is simple and you should be able to use it easily, but we will also have some autogenerated python classes for it soon
Also experimenting with some SQL-based representations that can efficiently be stored in sqlite/postgres, might be even faster to load from these
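To give a sense of how simple the JSON format is to consume, here is a minimal sketch that reads an obographs-style document with only the standard library and collects term labels and is_a parents. It assumes the layout described in the obographs repository linked above: a top-level `graphs` list, nodes with `id` and `lbl`, and edges with `sub`, `pred`, and `obj`. The format was still subject to change at the time of this thread, so treat those key names as assumptions. The function `load_obograph_parents` is hypothetical and is not part of goatools.

```python
import json
from collections import defaultdict


def load_obograph_parents(json_text):
    """Return (labels, parents) parsed from obographs-style JSON text.

    labels:  {term_id: label}
    parents: {term_id: set of is_a parent term_ids}
    Key names ("graphs", "nodes", "edges", "lbl", "sub", "pred", "obj")
    are assumed from the obographs proposal.
    """
    doc = json.loads(json_text)
    labels = {}
    parents = defaultdict(set)
    for graph in doc.get("graphs", []):
        for node in graph.get("nodes", []):
            labels[node["id"]] = node.get("lbl", "")
        for edge in graph.get("edges", []):
            # Keep only subclass relationships; other predicates are skipped.
            if edge.get("pred") == "is_a":
                parents[edge["sub"]].add(edge["obj"])
    return labels, dict(parents)


# A tiny hand-written document in the assumed layout.
EXAMPLE = """
{
  "graphs": [
    {
      "nodes": [
        {"id": "http://purl.obolibrary.org/obo/GO_0008150", "lbl": "biological_process"},
        {"id": "http://purl.obolibrary.org/obo/GO_0008152", "lbl": "metabolic process"}
      ],
      "edges": [
        {"sub": "http://purl.obolibrary.org/obo/GO_0008152",
         "pred": "is_a",
         "obj": "http://purl.obolibrary.org/obo/GO_0008150"}
      ]
    }
  ]
}
"""

labels, parents = load_obograph_parents(EXAMPLE)
```

For a real file, the text would come from reading the JSON file from disk instead of the inline `EXAMPLE` string. Term identifiers appear as full URIs in this layout, so a library would likely convert them to the familiar `GO:0008150` form before use.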